Parameter-efficient fine-tuning of small language models for code generation: a comparative study of Gemma, Qwen 2.5 and Llama 3.2.
| Title: | Parameter-efficient fine-tuning of small language models for code generation: a comparative study of Gemma, Qwen 2.5 and Llama 3.2. |
|---|---|
| Authors: | Nguyen, Van-Viet¹ (nvviet@ictu.edu.vn); Nguyen, The-Vinh¹ (vinhnt@ictu.edu.vn); Nguyen, Huu-Khanh² (khanhnh@tnu.edu.vn); Vu, Duc-Quang¹ (vdquang@ictu.edu.vn) |
| Source: | International Journal of Electrical & Computer Engineering (2088-8708). Feb 2026, Vol. 16, Issue 1, pp. 278-287. 10 pp. |
| Subjects: | Language models, Code generators, Mathematical optimization, Computer software |
| Abstract: | Large language models (LLMs) have demonstrated impressive capabilities in code generation; however, their high computational demands, privacy limitations, and challenges in edge deployment restrict their practical use in domain-specific applications. This study explores the effectiveness of parameter-efficient fine-tuning for small language models (SLMs) with fewer than 3 billion parameters. We adopt a hybrid approach that combines low-rank adaptation (LoRA) and 4-bit quantization (QLoRA) to reduce fine-tuning costs while preserving semantic consistency. Experiments on the CodeAlpaca-20k dataset reveal that SLMs fine-tuned with this method outperform larger baseline models, including Phi-3 Mini 4K base, in ROUGE-L. Notably, applying our approach to the LLaMA 3 3B and Qwen2.5 3B models yielded performance improvements of 54% and 55%, respectively, over untuned counterparts. We evaluate models developed by major artificial intelligence (AI) providers, namely Google (Gemma 2B), Meta (LLaMA 3 1B/3B), and Alibaba (Qwen2.5 1.5B/3B), and show that parameter-efficient fine-tuning enables them to serve as cost-effective, high-performing alternatives to larger LLMs. These findings highlight the potential of SLMs as scalable solutions for domain-specific software engineering tasks, supporting broader adoption and democratization of neural code synthesis. [ABSTRACT FROM AUTHOR] |
| Copyright: | Copyright of International Journal of Electrical & Computer Engineering (2088-8708) is the property of Institute of Advanced Engineering & Science and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.) |
| Database: | Engineering Source |
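
Note on the metric: the abstract reports results in ROUGE-L, which scores a generated output against a reference by the length of their longest common subsequence (LCS) of tokens. The following is a minimal sketch of the sentence-level ROUGE-L F-measure, not the paper's evaluation code. The whitespace tokenization, the function names `lcs_length` and `rouge_l`, and the default `beta=1.0` are assumptions made for illustration; this record does not state which ROUGE implementation or tokenizer the authors used.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    # Row-by-row dynamic programming; prev[j] is the LCS length of the
    # tokens of `a` seen so far against the first j tokens of `b`.
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(reference, candidate, beta=1.0):
    """Sentence-level ROUGE-L F-measure on whitespace tokens (assumed tokenization)."""
    ref = reference.split()
    cand = candidate.split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of candidate tokens that lie on the LCS
    recall = lcs / len(ref)      # share of reference tokens that lie on the LCS
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)


if __name__ == "__main__":
    # LCS is "a c d" (3 tokens); precision = recall = 3/4, so F = 0.75.
    print(rouge_l("a b c d", "a c d e"))
```

Because the score depends only on in-order token overlap, it rewards outputs that reproduce the reference's structure but does not check whether generated code actually runs.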