
KorMedMCQA: Multi-Choice Question Answering Benchmark for Korean Healthcare Professional Licensing Examinations (2403.01469v3)

Published 3 Mar 2024 in cs.CL

Abstract: We present KorMedMCQA, the first Korean Medical Multiple-Choice Question Answering benchmark, derived from professional healthcare licensing examinations conducted in Korea between 2012 and 2024. The dataset contains 7,469 questions from the doctor, nurse, pharmacist, and dentist examinations, covering a wide range of medical disciplines. We evaluate 59 LLMs, spanning proprietary and open-source models, multilingual and Korean-specialized models, and models fine-tuned for clinical applications. Our results show that Chain-of-Thought (CoT) reasoning can improve model performance by up to 4.5% over direct answering. We also investigate whether MedQA, one of the most widely used medical benchmarks, derived from the U.S. Medical Licensing Examination, can serve as a reliable proxy for evaluating model performance in other regions, in this case Korea. Our correlation analysis of model scores on KorMedMCQA and MedQA reveals that the two benchmarks align no better than benchmarks from entirely different domains (e.g., MedQA and MMLU-Pro). This finding underscores the substantial linguistic and clinical differences between Korean and U.S. medical contexts and reinforces the need for region-specific medical QA benchmarks. To support ongoing research in Korean healthcare AI, we publicly release KorMedMCQA via Hugging Face.

[2011] Colgan, R., Williams, M., Johnson, J.R.: Diagnosis and treatment of acute pyelonephritis in women. American family physician 84(5), 519–526 (2011) Wolf et al. [2019] Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: Huggingface’s transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019) Gao et al. [2023] Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac’h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836 . https://zenodo.org/records/10256836 Son et al. [2024] Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. [2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. 
[2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Colgan, R., Williams, M., Johnson, J.R.: Diagnosis and treatment of acute pyelonephritis in women. American family physician 84(5), 519–526 (2011) Wolf et al. [2019] Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: Huggingface’s transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019) Gao et al. [2023] Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac’h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836 . https://zenodo.org/records/10256836 Son et al. [2024] Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. [2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. 
[2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: Huggingface’s transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019) Gao et al. [2023] Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac’h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836 . https://zenodo.org/records/10256836 Son et al. 
[2024] Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. [2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac’h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). 
https://doi.org/10.5281/zenodo.10256836 . https://zenodo.org/records/10256836 Son et al. [2024] Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. [2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. 
[2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. 
[2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). 
Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. 
Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B
  2. Chen, Z., Cano, A.H., Romanou, A., Bonnet, A., Matoba, K., Salvi, F., Pagliardini, M., Fan, S., Köpf, A., Mohtashami, A., et al.: Meditron-70b: Scaling medical pretraining for large language models. arXiv preprint arXiv:2311.16079 (2023) Toma et al. [2023] Toma, A., Lawler, P.R., Ba, J., Krishnan, R.G., Rubin, B.B., Wang, B.: Clinical camel: An open-source expert-level medical language model with dialogue-based knowledge encoding. arXiv preprint arXiv:2305.12031 (2023) Han et al. [2023] Han, T., Adams, L.C., Papaioannou, J.-M., Grundmann, P., Oberhauser, T., Löser, A., Truhn, D., Bressem, K.K.: Medalpaca–an open-source collection of medical conversational ai models and training data. arXiv preprint arXiv:2304.08247 (2023) Dhakal et al. [2024] Dhakal, U., Singh, A.K., Devkota, S., Sapkota, Y., Lamichhane, B., Paudyal, S., Dhakal, C.: Gpt-4’s assessment of its performance in a usmle-based case study. arXiv preprint arXiv:2402.09654 (2024) Jin et al. [2021] Jin, D., Pan, E., Oufattole, N., Weng, W.-H., Fang, H., Szolovits, P.: What disease does this patient have? a large-scale open domain question answering dataset from medical exams. Applied Sciences 11(14), 6421 (2021) Vilares and Gómez-Rodríguez [2019] Vilares, D., Gómez-Rodríguez, C.: Head-qa: A healthcare dataset for complex reasoning. arXiv preprint arXiv:1906.04701 (2019) Labrak et al. [2023] Labrak, Y., Bazoge, A., Dufour, R., Rouvier, M., Morin, E., Daille, B., Gourraud, P.-A.: Frenchmedmcqa: A french multiple-choice question answering dataset for medical domain. arXiv preprint arXiv:2304.04280 (2023) Pal et al. [2022] Pal, A., Umapathi, L.K., Sankarasubbu, M.: Medmcqa: A large-scale multi-subject multi-choice dataset for medical domain question answering. In: Conference on Health, Inference, and Learning, pp. 248–260 (2022). PMLR of Urogenital Tract Infection and Inflammation [2018] Urogenital Tract Infection, K.A., Inflammation: Guidelines for the Antibiotic Use in Urinary Tract Infections, (2018). 
{}{}}{https://www.uti.or.kr/include/pdf/antibiotic-use-2018.pdf}{cmtt} Colgan et al. [2011] Colgan, R., Williams, M., Johnson, J.R.: Diagnosis and treatment of acute pyelonephritis in women. American family physician 84(5), 519–526 (2011) Wolf et al. [2019] Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: Huggingface’s transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019) Gao et al. [2023] Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac’h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836 . https://zenodo.org/records/10256836 Son et al. [2024] Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. [2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). 
{}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Toma, A., Lawler, P.R., Ba, J., Krishnan, R.G., Rubin, B.B., Wang, B.: Clinical camel: An open-source expert-level medical language model with dialogue-based knowledge encoding. arXiv preprint arXiv:2305.12031 (2023) Han et al. [2023] Han, T., Adams, L.C., Papaioannou, J.-M., Grundmann, P., Oberhauser, T., Löser, A., Truhn, D., Bressem, K.K.: Medalpaca–an open-source collection of medical conversational ai models and training data. arXiv preprint arXiv:2304.08247 (2023) Dhakal et al. [2024] Dhakal, U., Singh, A.K., Devkota, S., Sapkota, Y., Lamichhane, B., Paudyal, S., Dhakal, C.: Gpt-4’s assessment of its performance in a usmle-based case study. arXiv preprint arXiv:2402.09654 (2024) Jin et al. [2021] Jin, D., Pan, E., Oufattole, N., Weng, W.-H., Fang, H., Szolovits, P.: What disease does this patient have? a large-scale open domain question answering dataset from medical exams. Applied Sciences 11(14), 6421 (2021) Vilares and Gómez-Rodríguez [2019] Vilares, D., Gómez-Rodríguez, C.: Head-qa: A healthcare dataset for complex reasoning. arXiv preprint arXiv:1906.04701 (2019) Labrak et al. [2023] Labrak, Y., Bazoge, A., Dufour, R., Rouvier, M., Morin, E., Daille, B., Gourraud, P.-A.: Frenchmedmcqa: A french multiple-choice question answering dataset for medical domain. arXiv preprint arXiv:2304.04280 (2023) Pal et al. 
[2022] Pal, A., Umapathi, L.K., Sankarasubbu, M.: Medmcqa: A large-scale multi-subject multi-choice dataset for medical domain question answering. In: Conference on Health, Inference, and Learning, pp. 248–260 (2022). PMLR of Urogenital Tract Infection and Inflammation [2018] Urogenital Tract Infection, K.A., Inflammation: Guidelines for the Antibiotic Use in Urinary Tract Infections, (2018). {}{}}{https://www.uti.or.kr/include/pdf/antibiotic-use-2018.pdf}{cmtt} Colgan et al. [2011] Colgan, R., Williams, M., Johnson, J.R.: Diagnosis and treatment of acute pyelonephritis in women. American family physician 84(5), 519–526 (2011) Wolf et al. [2019] Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: Huggingface’s transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019) Gao et al. [2023] Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac’h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836 . https://zenodo.org/records/10256836 Son et al. [2024] Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. [2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. 
arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Han, T., Adams, L.C., Papaioannou, J.-M., Grundmann, P., Oberhauser, T., Löser, A., Truhn, D., Bressem, K.K.: Medalpaca–an open-source collection of medical conversational ai models and training data. arXiv preprint arXiv:2304.08247 (2023) Dhakal et al. [2024] Dhakal, U., Singh, A.K., Devkota, S., Sapkota, Y., Lamichhane, B., Paudyal, S., Dhakal, C.: Gpt-4’s assessment of its performance in a usmle-based case study. arXiv preprint arXiv:2402.09654 (2024) Jin et al. [2021] Jin, D., Pan, E., Oufattole, N., Weng, W.-H., Fang, H., Szolovits, P.: What disease does this patient have? a large-scale open domain question answering dataset from medical exams. Applied Sciences 11(14), 6421 (2021) Vilares and Gómez-Rodríguez [2019] Vilares, D., Gómez-Rodríguez, C.: Head-qa: A healthcare dataset for complex reasoning. arXiv preprint arXiv:1906.04701 (2019) Labrak et al. 
[2022] Pal, A., Umapathi, L.K., Sankarasubbu, M.: MedMCQA: A large-scale multi-subject multi-choice dataset for medical domain question answering. In: Conference on Health, Inference, and Learning, pp. 248–260 (2022). PMLR
Korean Association of Urogenital Tract Infection and Inflammation: Guidelines for the Antibiotic Use in Urinary Tract Infections (2018). https://www.uti.or.kr/include/pdf/antibiotic-use-2018.pdf
Colgan, R., Williams, M., Johnson, J.R.: Diagnosis and treatment of acute pyelonephritis in women. American Family Physician 84(5), 519–526 (2011)
Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: HuggingFace's Transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019)
Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac'h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836. https://zenodo.org/records/10256836
Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: KMMLU: Measuring massive multitask language understanding in Korean. arXiv preprint arXiv:2402.11548 (2024)
Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023)
Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023)
Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7B. arXiv preprint arXiv:2310.06825 (2023)
01.AI: Yi-34B: Building the next generation of open-source and bilingual LLMs (2024). https://huggingface.co/01-ai/Yi-34B
Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: SOLAR 10.7B: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023)
L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280. https://huggingface.co/beomi/llama-2-koen-13b
Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708. https://huggingface.co/beomi/Yi-Ko-6B
  3. Toma, A., Lawler, P.R., Ba, J., Krishnan, R.G., Rubin, B.B., Wang, B.: Clinical camel: An open-source expert-level medical language model with dialogue-based knowledge encoding. arXiv preprint arXiv:2305.12031 (2023) Han et al. [2023] Han, T., Adams, L.C., Papaioannou, J.-M., Grundmann, P., Oberhauser, T., Löser, A., Truhn, D., Bressem, K.K.: Medalpaca–an open-source collection of medical conversational ai models and training data. arXiv preprint arXiv:2304.08247 (2023) Dhakal et al. [2024] Dhakal, U., Singh, A.K., Devkota, S., Sapkota, Y., Lamichhane, B., Paudyal, S., Dhakal, C.: Gpt-4’s assessment of its performance in a usmle-based case study. arXiv preprint arXiv:2402.09654 (2024) Jin et al. [2021] Jin, D., Pan, E., Oufattole, N., Weng, W.-H., Fang, H., Szolovits, P.: What disease does this patient have? a large-scale open domain question answering dataset from medical exams. Applied Sciences 11(14), 6421 (2021) Vilares and Gómez-Rodríguez [2019] Vilares, D., Gómez-Rodríguez, C.: Head-qa: A healthcare dataset for complex reasoning. arXiv preprint arXiv:1906.04701 (2019) Labrak et al. [2023] Labrak, Y., Bazoge, A., Dufour, R., Rouvier, M., Morin, E., Daille, B., Gourraud, P.-A.: Frenchmedmcqa: A french multiple-choice question answering dataset for medical domain. arXiv preprint arXiv:2304.04280 (2023) Pal et al. [2022] Pal, A., Umapathi, L.K., Sankarasubbu, M.: Medmcqa: A large-scale multi-subject multi-choice dataset for medical domain question answering. In: Conference on Health, Inference, and Learning, pp. 248–260 (2022). PMLR of Urogenital Tract Infection and Inflammation [2018] Urogenital Tract Infection, K.A., Inflammation: Guidelines for the Antibiotic Use in Urinary Tract Infections, (2018). {}{}}{https://www.uti.or.kr/include/pdf/antibiotic-use-2018.pdf}{cmtt} Colgan et al. [2011] Colgan, R., Williams, M., Johnson, J.R.: Diagnosis and treatment of acute pyelonephritis in women. American family physician 84(5), 519–526 (2011) Wolf et al. 
[2019] Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: Huggingface’s transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019) Gao et al. [2023] Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac’h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836 . https://zenodo.org/records/10256836 Son et al. [2024] Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. [2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. 
Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Han, T., Adams, L.C., Papaioannou, J.-M., Grundmann, P., Oberhauser, T., Löser, A., Truhn, D., Bressem, K.K.: Medalpaca–an open-source collection of medical conversational ai models and training data. arXiv preprint arXiv:2304.08247 (2023) Dhakal et al. [2024] Dhakal, U., Singh, A.K., Devkota, S., Sapkota, Y., Lamichhane, B., Paudyal, S., Dhakal, C.: Gpt-4’s assessment of its performance in a usmle-based case study. arXiv preprint arXiv:2402.09654 (2024) Jin et al. [2021] Jin, D., Pan, E., Oufattole, N., Weng, W.-H., Fang, H., Szolovits, P.: What disease does this patient have? a large-scale open domain question answering dataset from medical exams. Applied Sciences 11(14), 6421 (2021) Vilares and Gómez-Rodríguez [2019] Vilares, D., Gómez-Rodríguez, C.: Head-qa: A healthcare dataset for complex reasoning. arXiv preprint arXiv:1906.04701 (2019) Labrak et al. [2023] Labrak, Y., Bazoge, A., Dufour, R., Rouvier, M., Morin, E., Daille, B., Gourraud, P.-A.: Frenchmedmcqa: A french multiple-choice question answering dataset for medical domain. arXiv preprint arXiv:2304.04280 (2023) Pal et al. [2022] Pal, A., Umapathi, L.K., Sankarasubbu, M.: Medmcqa: A large-scale multi-subject multi-choice dataset for medical domain question answering. In: Conference on Health, Inference, and Learning, pp. 248–260 (2022). PMLR of Urogenital Tract Infection and Inflammation [2018] Urogenital Tract Infection, K.A., Inflammation: Guidelines for the Antibiotic Use in Urinary Tract Infections, (2018). {}{}}{https://www.uti.or.kr/include/pdf/antibiotic-use-2018.pdf}{cmtt} Colgan et al. 
[2011] Colgan, R., Williams, M., Johnson, J.R.: Diagnosis and treatment of acute pyelonephritis in women. American family physician 84(5), 519–526 (2011) Wolf et al. [2019] Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: Huggingface’s transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019) Gao et al. [2023] Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac’h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836 . https://zenodo.org/records/10256836 Son et al. [2024] Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. [2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. 
[2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Dhakal, U., Singh, A.K., Devkota, S., Sapkota, Y., Lamichhane, B., Paudyal, S., Dhakal, C.: Gpt-4’s assessment of its performance in a usmle-based case study. arXiv preprint arXiv:2402.09654 (2024) Jin et al. [2021] Jin, D., Pan, E., Oufattole, N., Weng, W.-H., Fang, H., Szolovits, P.: What disease does this patient have? a large-scale open domain question answering dataset from medical exams. Applied Sciences 11(14), 6421 (2021) Vilares and Gómez-Rodríguez [2019] Vilares, D., Gómez-Rodríguez, C.: Head-qa: A healthcare dataset for complex reasoning. arXiv preprint arXiv:1906.04701 (2019) Labrak et al. [2023] Labrak, Y., Bazoge, A., Dufour, R., Rouvier, M., Morin, E., Daille, B., Gourraud, P.-A.: Frenchmedmcqa: A french multiple-choice question answering dataset for medical domain. arXiv preprint arXiv:2304.04280 (2023) Pal et al. [2022] Pal, A., Umapathi, L.K., Sankarasubbu, M.: Medmcqa: A large-scale multi-subject multi-choice dataset for medical domain question answering. In: Conference on Health, Inference, and Learning, pp. 248–260 (2022). PMLR of Urogenital Tract Infection and Inflammation [2018] Urogenital Tract Infection, K.A., Inflammation: Guidelines for the Antibiotic Use in Urinary Tract Infections, (2018). {}{}}{https://www.uti.or.kr/include/pdf/antibiotic-use-2018.pdf}{cmtt} Colgan et al. [2011] Colgan, R., Williams, M., Johnson, J.R.: Diagnosis and treatment of acute pyelonephritis in women. 
American family physician 84(5), 519–526 (2011) Wolf et al. [2019] Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: Huggingface’s transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019) Gao et al. [2023] Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac’h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836 . https://zenodo.org/records/10256836 Son et al. [2024] Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. [2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. 
[2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Jin, D., Pan, E., Oufattole, N., Weng, W.-H., Fang, H., Szolovits, P.: What disease does this patient have? a large-scale open domain question answering dataset from medical exams. Applied Sciences 11(14), 6421 (2021) Vilares and Gómez-Rodríguez [2019] Vilares, D., Gómez-Rodríguez, C.: Head-qa: A healthcare dataset for complex reasoning. arXiv preprint arXiv:1906.04701 (2019) Labrak et al. [2023] Labrak, Y., Bazoge, A., Dufour, R., Rouvier, M., Morin, E., Daille, B., Gourraud, P.-A.: Frenchmedmcqa: A french multiple-choice question answering dataset for medical domain. arXiv preprint arXiv:2304.04280 (2023) Pal et al. [2022] Pal, A., Umapathi, L.K., Sankarasubbu, M.: Medmcqa: A large-scale multi-subject multi-choice dataset for medical domain question answering. In: Conference on Health, Inference, and Learning, pp. 248–260 (2022). PMLR of Urogenital Tract Infection and Inflammation [2018] Urogenital Tract Infection, K.A., Inflammation: Guidelines for the Antibiotic Use in Urinary Tract Infections, (2018). {}{}}{https://www.uti.or.kr/include/pdf/antibiotic-use-2018.pdf}{cmtt} Colgan et al. [2011] Colgan, R., Williams, M., Johnson, J.R.: Diagnosis and treatment of acute pyelonephritis in women. American family physician 84(5), 519–526 (2011) Wolf et al. 
[2019] Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: Huggingface’s transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019) Gao et al. [2023] Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac’h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836 . https://zenodo.org/records/10256836 Son et al. [2024] Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. [2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. 
Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Vilares, D., Gómez-Rodríguez, C.: Head-qa: A healthcare dataset for complex reasoning. arXiv preprint arXiv:1906.04701 (2019) Labrak et al. [2023] Labrak, Y., Bazoge, A., Dufour, R., Rouvier, M., Morin, E., Daille, B., Gourraud, P.-A.: Frenchmedmcqa: A french multiple-choice question answering dataset for medical domain. arXiv preprint arXiv:2304.04280 (2023) Pal et al. [2022] Pal, A., Umapathi, L.K., Sankarasubbu, M.: Medmcqa: A large-scale multi-subject multi-choice dataset for medical domain question answering. In: Conference on Health, Inference, and Learning, pp. 248–260 (2022). PMLR of Urogenital Tract Infection and Inflammation [2018] Urogenital Tract Infection, K.A., Inflammation: Guidelines for the Antibiotic Use in Urinary Tract Infections, (2018). {}{}}{https://www.uti.or.kr/include/pdf/antibiotic-use-2018.pdf}{cmtt} Colgan et al. [2011] Colgan, R., Williams, M., Johnson, J.R.: Diagnosis and treatment of acute pyelonephritis in women. American family physician 84(5), 519–526 (2011) Wolf et al. [2019] Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: Huggingface’s transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019) Gao et al. [2023] Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac’h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). 
https://doi.org/10.5281/zenodo.10256836 . https://zenodo.org/records/10256836 Son et al. [2024] Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. [2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Labrak, Y., Bazoge, A., Dufour, R., Rouvier, M., Morin, E., Daille, B., Gourraud, P.-A.: Frenchmedmcqa: A french multiple-choice question answering dataset for medical domain. arXiv preprint arXiv:2304.04280 (2023) Pal et al. 
[2022] Pal, A., Umapathi, L.K., Sankarasubbu, M.: Medmcqa: A large-scale multi-subject multi-choice dataset for medical domain question answering. In: Conference on Health, Inference, and Learning, pp. 248–260 (2022). PMLR of Urogenital Tract Infection and Inflammation [2018] Urogenital Tract Infection, K.A., Inflammation: Guidelines for the Antibiotic Use in Urinary Tract Infections, (2018). {}{}}{https://www.uti.or.kr/include/pdf/antibiotic-use-2018.pdf}{cmtt} Colgan et al. [2011] Colgan, R., Williams, M., Johnson, J.R.: Diagnosis and treatment of acute pyelonephritis in women. American family physician 84(5), 519–526 (2011) Wolf et al. [2019] Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: Huggingface’s transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019) Gao et al. [2023] Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac’h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836 . https://zenodo.org/records/10256836 Son et al. [2024] Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. [2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. 
arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Pal, A., Umapathi, L.K., Sankarasubbu, M.: Medmcqa: A large-scale multi-subject multi-choice dataset for medical domain question answering. In: Conference on Health, Inference, and Learning, pp. 248–260 (2022). PMLR of Urogenital Tract Infection and Inflammation [2018] Urogenital Tract Infection, K.A., Inflammation: Guidelines for the Antibiotic Use in Urinary Tract Infections, (2018). {}{}}{https://www.uti.or.kr/include/pdf/antibiotic-use-2018.pdf}{cmtt} Colgan et al. [2011] Colgan, R., Williams, M., Johnson, J.R.: Diagnosis and treatment of acute pyelonephritis in women. American family physician 84(5), 519–526 (2011) Wolf et al. [2019] Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: Huggingface’s transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019) Gao et al. 
[2023] Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac’h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836 . https://zenodo.org/records/10256836 Son et al. [2024] Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. [2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). 
https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Urogenital Tract Infection, K.A., Inflammation: Guidelines for the Antibiotic Use in Urinary Tract Infections, (2018). {}{}}{https://www.uti.or.kr/include/pdf/antibiotic-use-2018.pdf}{cmtt} Colgan et al. [2011] Colgan, R., Williams, M., Johnson, J.R.: Diagnosis and treatment of acute pyelonephritis in women. American family physician 84(5), 519–526 (2011) Wolf et al. [2019] Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: Huggingface’s transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019) Gao et al. [2023] Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac’h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836 . https://zenodo.org/records/10256836 Son et al. [2024] Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. [2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. 
[2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Colgan, R., Williams, M., Johnson, J.R.: Diagnosis and treatment of acute pyelonephritis in women. American family physician 84(5), 519–526 (2011) Wolf et al. [2019] Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: Huggingface’s transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019) Gao et al. [2023] Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac’h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836 . https://zenodo.org/records/10256836 Son et al. [2024] Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. 
arXiv preprint arXiv:2402.11548 (2024) Achiam et al. [2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: Huggingface’s transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019) Gao et al. 
[2023] Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac’h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836 . https://zenodo.org/records/10256836 Son et al. [2024] Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. [2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). 
https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac’h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836 . https://zenodo.org/records/10256836 Son et al. [2024] Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. [2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . 
https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. [2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. 
arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. 
[2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . 
https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B
  4. Han, T., Adams, L.C., Papaioannou, J.-M., Grundmann, P., Oberhauser, T., Löser, A., Truhn, D., Bressem, K.K.: MedAlpaca: An open-source collection of medical conversational AI models and training data. arXiv preprint arXiv:2304.08247 (2023)
  5. Dhakal, U., Singh, A.K., Devkota, S., Sapkota, Y., Lamichhane, B., Paudyal, S., Dhakal, C.: GPT-4's assessment of its performance in a USMLE-based case study. arXiv preprint arXiv:2402.09654 (2024)
  6. Jin, D., Pan, E., Oufattole, N., Weng, W.-H., Fang, H., Szolovits, P.: What disease does this patient have? A large-scale open domain question answering dataset from medical exams. Applied Sciences 11(14), 6421 (2021)
  7. Vilares, D., Gómez-Rodríguez, C.: HEAD-QA: A healthcare dataset for complex reasoning. arXiv preprint arXiv:1906.04701 (2019)
  8. Labrak, Y., Bazoge, A., Dufour, R., Rouvier, M., Morin, E., Daille, B., Gourraud, P.-A.: FrenchMedMCQA: A French multiple-choice question answering dataset for medical domain. arXiv preprint arXiv:2304.04280 (2023)
  9. Pal, A., Umapathi, L.K., Sankarasubbu, M.: MedMCQA: A large-scale multi-subject multi-choice dataset for medical domain question answering. In: Conference on Health, Inference, and Learning, pp. 248–260 (2022). PMLR
 10. Korean Association of Urogenital Tract Infection and Inflammation: Guidelines for the Antibiotic Use in Urinary Tract Infections (2018). https://www.uti.or.kr/include/pdf/antibiotic-use-2018.pdf
 11. Colgan, R., Williams, M., Johnson, J.R.: Diagnosis and treatment of acute pyelonephritis in women. American Family Physician 84(5), 519–526 (2011)
 12. Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: HuggingFace's Transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019)
 13. Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac'h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836. https://zenodo.org/records/10256836
 14. Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: KMMLU: Measuring massive multitask language understanding in Korean. arXiv preprint arXiv:2402.11548 (2024)
 15. Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023)
 16. Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023)
 17. Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7B. arXiv preprint arXiv:2310.06825 (2023)
 18. 01.AI: Building the next generation of open-source and bilingual LLMs (2024). https://huggingface.co/01-ai/Yi-34B
 19. Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: SOLAR 10.7B: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023)
 20. L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280. https://huggingface.co/beomi/llama-2-koen-13b
 21. Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708. https://huggingface.co/beomi/Yi-Ko-6B
[2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac’h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836 . https://zenodo.org/records/10256836 Son et al. 
[2024] Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. [2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. 
[2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. 
[2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). 
Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. 
Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B
  5. Dhakal, U., Singh, A.K., Devkota, S., Sapkota, Y., Lamichhane, B., Paudyal, S., Dhakal, C.: GPT-4's assessment of its performance in a USMLE-based case study. arXiv preprint arXiv:2402.09654 (2024)
  6. Jin, D., Pan, E., Oufattole, N., Weng, W.-H., Fang, H., Szolovits, P.: What disease does this patient have? A large-scale open domain question answering dataset from medical exams. Applied Sciences 11(14), 6421 (2021)
  7. Vilares, D., Gómez-Rodríguez, C.: HEAD-QA: A healthcare dataset for complex reasoning. arXiv preprint arXiv:1906.04701 (2019)
  8. Labrak, Y., Bazoge, A., Dufour, R., Rouvier, M., Morin, E., Daille, B., Gourraud, P.-A.: FrenchMedMCQA: A French multiple-choice question answering dataset for medical domain. arXiv preprint arXiv:2304.04280 (2023)
  9. Pal, A., Umapathi, L.K., Sankarasubbu, M.: MedMCQA: A large-scale multi-subject multi-choice dataset for medical domain question answering. In: Conference on Health, Inference, and Learning, pp. 248–260 (2022). PMLR
  10. Korean Association of Urogenital Tract Infection and Inflammation: Guidelines for the Antibiotic Use in Urinary Tract Infections (2018). https://www.uti.or.kr/include/pdf/antibiotic-use-2018.pdf
  11. Colgan, R., Williams, M., Johnson, J.R.: Diagnosis and treatment of acute pyelonephritis in women. American Family Physician 84(5), 519–526 (2011)
  12. Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: HuggingFace's Transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019)
  13. Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac'h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836
  14. Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: KMMLU: Measuring massive multitask language understanding in Korean. arXiv preprint arXiv:2402.11548 (2024)
  15. Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023)
  16. Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023)
  17. Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7B. arXiv preprint arXiv:2310.06825 (2023)
  18. 01.AI: Yi-34B: Building the next generation of open-source and bilingual LLMs (2024). https://huggingface.co/01-ai/Yi-34B
  19. Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: SOLAR 10.7B: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023)
  20. L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b
  21. Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B
[2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac’h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836 . https://zenodo.org/records/10256836 Son et al. 
[2024] Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. [2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. 
[2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. 
[2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). 
Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. 
Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B
  6. Jin, D., Pan, E., Oufattole, N., Weng, W.-H., Fang, H., Szolovits, P.: What disease does this patient have? a large-scale open domain question answering dataset from medical exams. Applied Sciences 11(14), 6421 (2021) Vilares and Gómez-Rodríguez [2019] Vilares, D., Gómez-Rodríguez, C.: Head-qa: A healthcare dataset for complex reasoning. arXiv preprint arXiv:1906.04701 (2019) Labrak et al. [2023] Labrak, Y., Bazoge, A., Dufour, R., Rouvier, M., Morin, E., Daille, B., Gourraud, P.-A.: Frenchmedmcqa: A french multiple-choice question answering dataset for medical domain. arXiv preprint arXiv:2304.04280 (2023) Pal et al. [2022] Pal, A., Umapathi, L.K., Sankarasubbu, M.: Medmcqa: A large-scale multi-subject multi-choice dataset for medical domain question answering. In: Conference on Health, Inference, and Learning, pp. 248–260 (2022). PMLR of Urogenital Tract Infection and Inflammation [2018] Urogenital Tract Infection, K.A., Inflammation: Guidelines for the Antibiotic Use in Urinary Tract Infections, (2018). {}{}}{https://www.uti.or.kr/include/pdf/antibiotic-use-2018.pdf}{cmtt} Colgan et al. [2011] Colgan, R., Williams, M., Johnson, J.R.: Diagnosis and treatment of acute pyelonephritis in women. American family physician 84(5), 519–526 (2011) Wolf et al. [2019] Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: Huggingface’s transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019) Gao et al. [2023] Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac’h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836 . https://zenodo.org/records/10256836 Son et al. 
[2024] Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. [2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Vilares, D., Gómez-Rodríguez, C.: Head-qa: A healthcare dataset for complex reasoning. arXiv preprint arXiv:1906.04701 (2019) Labrak et al. [2023] Labrak, Y., Bazoge, A., Dufour, R., Rouvier, M., Morin, E., Daille, B., Gourraud, P.-A.: Frenchmedmcqa: A french multiple-choice question answering dataset for medical domain. arXiv preprint arXiv:2304.04280 (2023) Pal et al. 
[2022] Pal, A., Umapathi, L.K., Sankarasubbu, M.: Medmcqa: A large-scale multi-subject multi-choice dataset for medical domain question answering. In: Conference on Health, Inference, and Learning, pp. 248–260 (2022). PMLR of Urogenital Tract Infection and Inflammation [2018] Urogenital Tract Infection, K.A., Inflammation: Guidelines for the Antibiotic Use in Urinary Tract Infections, (2018). {}{}}{https://www.uti.or.kr/include/pdf/antibiotic-use-2018.pdf}{cmtt} Colgan et al. [2011] Colgan, R., Williams, M., Johnson, J.R.: Diagnosis and treatment of acute pyelonephritis in women. American family physician 84(5), 519–526 (2011) Wolf et al. [2019] Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: Huggingface’s transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019) Gao et al. [2023] Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac’h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836 . https://zenodo.org/records/10256836 Son et al. [2024] Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. [2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. 
arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Labrak, Y., Bazoge, A., Dufour, R., Rouvier, M., Morin, E., Daille, B., Gourraud, P.-A.: Frenchmedmcqa: A french multiple-choice question answering dataset for medical domain. arXiv preprint arXiv:2304.04280 (2023) Pal et al. [2022] Pal, A., Umapathi, L.K., Sankarasubbu, M.: Medmcqa: A large-scale multi-subject multi-choice dataset for medical domain question answering. In: Conference on Health, Inference, and Learning, pp. 248–260 (2022). PMLR of Urogenital Tract Infection and Inflammation [2018] Urogenital Tract Infection, K.A., Inflammation: Guidelines for the Antibiotic Use in Urinary Tract Infections, (2018). {}{}}{https://www.uti.or.kr/include/pdf/antibiotic-use-2018.pdf}{cmtt} Colgan et al. [2011] Colgan, R., Williams, M., Johnson, J.R.: Diagnosis and treatment of acute pyelonephritis in women. American family physician 84(5), 519–526 (2011) Wolf et al. 
[2019] Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: Huggingface’s transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019) Gao et al. [2023] Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac’h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836 . https://zenodo.org/records/10256836 Son et al. [2024] Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. [2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. 
Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Pal, A., Umapathi, L.K., Sankarasubbu, M.: Medmcqa: A large-scale multi-subject multi-choice dataset for medical domain question answering. In: Conference on Health, Inference, and Learning, pp. 248–260 (2022). PMLR of Urogenital Tract Infection and Inflammation [2018] Urogenital Tract Infection, K.A., Inflammation: Guidelines for the Antibiotic Use in Urinary Tract Infections, (2018). {}{}}{https://www.uti.or.kr/include/pdf/antibiotic-use-2018.pdf}{cmtt} Colgan et al. [2011] Colgan, R., Williams, M., Johnson, J.R.: Diagnosis and treatment of acute pyelonephritis in women. American family physician 84(5), 519–526 (2011) Wolf et al. [2019] Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: Huggingface’s transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019) Gao et al. [2023] Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac’h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836 . https://zenodo.org/records/10256836 Son et al. [2024] Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. 
[2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Urogenital Tract Infection, K.A., Inflammation: Guidelines for the Antibiotic Use in Urinary Tract Infections, (2018). {}{}}{https://www.uti.or.kr/include/pdf/antibiotic-use-2018.pdf}{cmtt} Colgan et al. [2011] Colgan, R., Williams, M., Johnson, J.R.: Diagnosis and treatment of acute pyelonephritis in women. American family physician 84(5), 519–526 (2011) Wolf et al. [2019] Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: Huggingface’s transformers: State-of-the-art natural language processing. 
arXiv preprint arXiv:1910.03771 (2019) Gao et al. [2023] Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac’h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836 . https://zenodo.org/records/10256836 Son et al. [2024] Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. [2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . 
https://huggingface.co/beomi/llama-2-koen-13b
  7. Vilares, D., Gómez-Rodríguez, C.: HEAD-QA: A healthcare dataset for complex reasoning. arXiv preprint arXiv:1906.04701 (2019)
  8. Labrak, Y., Bazoge, A., Dufour, R., Rouvier, M., Morin, E., Daille, B., Gourraud, P.-A.: FrenchMedMCQA: A French multiple-choice question answering dataset for the medical domain. arXiv preprint arXiv:2304.04280 (2023)
  9. Pal, A., Umapathi, L.K., Sankarasubbu, M.: MedMCQA: A large-scale multi-subject multi-choice dataset for medical domain question answering. In: Conference on Health, Inference, and Learning, pp. 248–260 (2022). PMLR
  10. Korean Association of Urogenital Tract Infection and Inflammation: Guidelines for the Antibiotic Use in Urinary Tract Infections (2018). https://www.uti.or.kr/include/pdf/antibiotic-use-2018.pdf
  11. Colgan, R., Williams, M., Johnson, J.R.: Diagnosis and treatment of acute pyelonephritis in women. American Family Physician 84(5), 519–526 (2011)
  12. Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: HuggingFace's Transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019)
  13. Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac'h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836. https://zenodo.org/records/10256836
  14. Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: KMMLU: Measuring massive multitask language understanding in Korean. arXiv preprint arXiv:2402.11548 (2024)
  15. Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023)
  16. Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023)
  17. Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7B. arXiv preprint arXiv:2310.06825 (2023)
  18. 01.AI: Building the Next Generation of Open-source and Bilingual LLMs (2024). https://huggingface.co/01-ai/Yi-34B
  19. Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: SOLAR 10.7B: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023)
  20. L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280. https://huggingface.co/beomi/llama-2-koen-13b
  21. Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708. https://huggingface.co/beomi/Yi-Ko-6B
  8. Labrak, Y., Bazoge, A., Dufour, R., Rouvier, M., Morin, E., Daille, B., Gourraud, P.-A.: Frenchmedmcqa: A french multiple-choice question answering dataset for medical domain. arXiv preprint arXiv:2304.04280 (2023) Pal et al. [2022] Pal, A., Umapathi, L.K., Sankarasubbu, M.: Medmcqa: A large-scale multi-subject multi-choice dataset for medical domain question answering. In: Conference on Health, Inference, and Learning, pp. 248–260 (2022). PMLR of Urogenital Tract Infection and Inflammation [2018] Urogenital Tract Infection, K.A., Inflammation: Guidelines for the Antibiotic Use in Urinary Tract Infections, (2018). {}{}}{https://www.uti.or.kr/include/pdf/antibiotic-use-2018.pdf}{cmtt} Colgan et al. [2011] Colgan, R., Williams, M., Johnson, J.R.: Diagnosis and treatment of acute pyelonephritis in women. American family physician 84(5), 519–526 (2011) Wolf et al. [2019] Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: Huggingface’s transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019) Gao et al. [2023] Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac’h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836 . https://zenodo.org/records/10256836 Son et al. [2024] Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. [2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. 
arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Pal, A., Umapathi, L.K., Sankarasubbu, M.: Medmcqa: A large-scale multi-subject multi-choice dataset for medical domain question answering. In: Conference on Health, Inference, and Learning, pp. 248–260 (2022). PMLR of Urogenital Tract Infection and Inflammation [2018] Urogenital Tract Infection, K.A., Inflammation: Guidelines for the Antibiotic Use in Urinary Tract Infections, (2018). {}{}}{https://www.uti.or.kr/include/pdf/antibiotic-use-2018.pdf}{cmtt} Colgan et al. [2011] Colgan, R., Williams, M., Johnson, J.R.: Diagnosis and treatment of acute pyelonephritis in women. American family physician 84(5), 519–526 (2011) Wolf et al. 
[2019] Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: Huggingface’s transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019) Gao et al. [2023] Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac’h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836 . https://zenodo.org/records/10256836 Son et al. [2024] Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. [2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. 
Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Urogenital Tract Infection, K.A., Inflammation: Guidelines for the Antibiotic Use in Urinary Tract Infections, (2018). {}{}}{https://www.uti.or.kr/include/pdf/antibiotic-use-2018.pdf}{cmtt} Colgan et al. [2011] Colgan, R., Williams, M., Johnson, J.R.: Diagnosis and treatment of acute pyelonephritis in women. American family physician 84(5), 519–526 (2011) Wolf et al. [2019] Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: Huggingface’s transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019) Gao et al. [2023] Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac’h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836 . https://zenodo.org/records/10256836 Son et al. [2024] Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. [2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. 
[2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Colgan, R., Williams, M., Johnson, J.R.: Diagnosis and treatment of acute pyelonephritis in women. American family physician 84(5), 519–526 (2011) Wolf et al. [2019] Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: Huggingface’s transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019) Gao et al. [2023] Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac’h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836 . 
https://zenodo.org/records/10256836 Son et al. [2024] Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. [2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: Huggingface’s transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019) Gao et al. 
[2023] Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac’h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836 . https://zenodo.org/records/10256836 Son et al. [2024] Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. [2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). 
https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac’h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836 . https://zenodo.org/records/10256836 Son et al. [2024] Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. [2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . 
https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. [2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. 
arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. 
[2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . 
https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B
  9. Pal, A., Umapathi, L.K., Sankarasubbu, M.: Medmcqa: A large-scale multi-subject multi-choice dataset for medical domain question answering. In: Conference on Health, Inference, and Learning, pp. 248–260 (2022). PMLR of Urogenital Tract Infection and Inflammation [2018] Urogenital Tract Infection, K.A., Inflammation: Guidelines for the Antibiotic Use in Urinary Tract Infections, (2018). {}{}}{https://www.uti.or.kr/include/pdf/antibiotic-use-2018.pdf}{cmtt} Colgan et al. [2011] Colgan, R., Williams, M., Johnson, J.R.: Diagnosis and treatment of acute pyelonephritis in women. American family physician 84(5), 519–526 (2011) Wolf et al. [2019] Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: Huggingface’s transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019) Gao et al. [2023] Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac’h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836 . https://zenodo.org/records/10256836 Son et al. [2024] Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: Kmmlu: Measuring massive multitask language understanding in korean. arXiv preprint arXiv:2402.11548 (2024) Achiam et al. [2023] Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023) Touvron et al. [2023] Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. 
arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Urogenital Tract Infection, K.A., Inflammation: Guidelines for the Antibiotic Use in Urinary Tract Infections, (2018). {}{}}{https://www.uti.or.kr/include/pdf/antibiotic-use-2018.pdf}{cmtt} Colgan et al. [2011] Colgan, R., Williams, M., Johnson, J.R.: Diagnosis and treatment of acute pyelonephritis in women. American family physician 84(5), 519–526 (2011) Wolf et al. [2019] Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: Huggingface’s transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019) Gao et al. [2023] Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac’h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). 
https://doi.org/10.5281/zenodo.10256836 . https://zenodo.org/records/10256836
  10. Korean Association of Urogenital Tract Infection and Inflammation: Guidelines for the Antibiotic Use in Urinary Tract Infections (2018). https://www.uti.or.kr/include/pdf/antibiotic-use-2018.pdf
  11. Colgan, R., Williams, M., Johnson, J.R.: Diagnosis and treatment of acute pyelonephritis in women. American Family Physician 84(5), 519–526 (2011)
  12. Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: HuggingFace's Transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019)
  13. Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac'h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836 . https://zenodo.org/records/10256836
  14. Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: KMMLU: Measuring massive multitask language understanding in Korean. arXiv preprint arXiv:2402.11548 (2024)
  15. Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023)
  16. Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023)
  17. Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7B. arXiv preprint arXiv:2310.06825 (2023)
  18. 01.AI: Building the next generation of open-source and bilingual LLMs (2024). https://huggingface.co/01-ai/Yi-34B
  19. Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: SOLAR 10.7B: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023)
  20. L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b
  21. Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B
Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023) Jiang et al. [2023] Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023) 01.AI [2024] 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. 
[2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B 01.AI: Building the Next Generation of Open-source and Bilingual Llms., (2024). {}{}}{https://huggingface.co/01-ai/Yi-34B}{cmtt} Kim et al. [2023] Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7 b: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023) L. Junbum, Taekyoon Choi [2023] L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280 . 
https://huggingface.co/beomi/llama-2-koen-13b Lee Junbum [2024] Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708 . https://huggingface.co/beomi/Yi-Ko-6B
  12. Wolf, T., Debut, L., Sanh, V., Chaumond, J., Delangue, C., Moi, A., Cistac, P., Rault, T., Louf, R., Funtowicz, M., et al.: HuggingFace's Transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771 (2019)
  13. Gao, L., Tow, J., Abbasi, B., Biderman, S., Black, S., DiPofi, A., Foster, C., Golding, L., Hsu, J., Le Noac'h, A., Li, H., McDonell, K., Muennighoff, N., Ociepa, C., Phang, J., Reynolds, L., Schoelkopf, H., Skowron, A., Sutawika, L., Tang, E., Thite, A., Wang, B., Wang, K., Zou, A.: A framework for few-shot language model evaluation. Zenodo (2023). https://doi.org/10.5281/zenodo.10256836 . https://zenodo.org/records/10256836
  14. Son, G., Lee, H., Kim, S., Kim, S., Muennighoff, N., Choi, T., Park, C., Yoo, K.M., Biderman, S.: KMMLU: Measuring massive multitask language understanding in Korean. arXiv preprint arXiv:2402.11548 (2024)
  15. Achiam, J., Adler, S., Agarwal, S., Ahmad, L., Akkaya, I., Aleman, F.L., Almeida, D., Altenschmidt, J., Altman, S., Anadkat, S., et al.: Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023)
  16. Touvron, H., Martin, L., Stone, K., Albert, P., Almahairi, A., Babaei, Y., Bashlykov, N., Batra, S., Bhargava, P., Bhosale, S., et al.: Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288 (2023)
  17. Jiang, A.Q., Sablayrolles, A., Mensch, A., Bamford, C., Chaplot, D.S., Casas, D.d.l., Bressand, F., Lengyel, G., Lample, G., Saulnier, L., et al.: Mistral 7b. arXiv preprint arXiv:2310.06825 (2023)
  18. 01.AI: Building the Next Generation of Open-source and Bilingual LLMs (2024). https://huggingface.co/01-ai/Yi-34B
  19. Kim, D., Park, C., Kim, S., Lee, W., Song, W., Kim, Y., Kim, H., Kim, Y., Lee, H., Kim, J., et al.: Solar 10.7B: Scaling large language models with simple yet effective depth up-scaling. arXiv preprint arXiv:2312.15166 (2023)
  20. L. Junbum, Taekyoon Choi: llama-2-koen-13b. Hugging Face (2023). https://doi.org/10.57967/hf/1280. https://huggingface.co/beomi/llama-2-koen-13b
  21. Lee Junbum: Yi-Ko-6B (Revision 205083a). Hugging Face (2024). https://doi.org/10.57967/hf/1708. https://huggingface.co/beomi/Yi-Ko-6B
Authors (10)
  1. Sunjun Kweon (7 papers)
  2. Byungjin Choi (1 paper)
  3. Minkyu Kim (51 papers)
  4. Rae Woong Park (2 papers)
  5. Edward Choi (90 papers)
  6. Gyouk Chu (2 papers)
  7. Junyeong Song (1 paper)
  8. Daeun Hyeon (1 paper)
  9. Sujin Gan (1 paper)
  10. Jueon Kim (1 paper)
Citations (4)