
Publication:

Fine-tuning Small Language Models for Javanese Translation

datacite.rights: restricted
dc.contributor.advisor: Fellbaum, Christiane Dorothea
dc.contributor.author: Menezes, Trivan T.
dc.date.accessioned: 2026-01-05T19:30:22Z
dc.date.available: 2026-01-05T19:30:22Z
dc.date.issued: 2025
dc.description.abstract: Natural language processing tools remain scarce for low-resource languages, even when those languages have large speaker populations. This research investigates improving machine translation for Javanese, a language of Indonesia with over 80 million speakers but limited digital resources. We evaluate three fine-tuning techniques for enhancing Javanese-to-Indonesian translation quality: supervised fine-tuning (SFT), model distillation, and Chain-of-Thought (CoT) distillation, applied to open-weight models (T5, mT5, Gemma 3 4B, Aya 8B, Aya 32B) and compared against zero-shot and many-shot baselines from larger proprietary models. Evaluation using BLEU, TER, chrF, and BERTScore reveals that while large models such as Gemini 2.0 Flash achieve top performance, fine-tuning significantly boosts the performance of smaller models. The first stage, SFT on the 500-example NusaX dataset, produced a significant improvement in translation quality. Subsequent model distillation and CoT distillation yielded only marginal improvements over SFT, suggesting diminishing returns, potentially limited by pre-training knowledge. The improvements were nonetheless tangible: the fine-tuned 4-billion-parameter Gemma 3 model achieved performance comparable to, and sometimes exceeding, much larger models such as GPT-4o in a zero-shot setup. These results show that fine-tuning smaller, accessible models offers a resource-efficient path to high-quality translation for low-resource languages like Javanese, potentially enabling deployment on edge devices and broadening access to NLP technologies for underserved linguistic communities.
dc.identifier.uri: https://theses-dissertations.princeton.edu/handle/88435/dsp015d86p368h
dc.language.iso: en_US
dc.title: Fine-tuning Small Language Models for Javanese Translation
dc.type: Princeton University Senior Theses
dspace.entity.type: Publication
dspace.workflow.startDateTime: 2025-12-15T16:55:31.683Z
pu.contributor.authorid: 920228118
pu.date.classyear: 2025
pu.department: Computer Science
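The abstract evaluates translation quality with BLEU, TER, chrF, and BERTScore; in practice these scores come from established tooling such as sacreBLEU. As a self-contained illustration of the character-level idea behind chrF, the following is a simplified, standard-library-only sketch; the example sentences are hypothetical, and real chrF implementations differ in details such as tokenization and corpus-level aggregation:

```python
from collections import Counter

def char_ngrams(text, n):
    # Character n-grams with spaces removed, as in chrF.
    s = text.replace(" ", "")
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))

def chrf_score(hypothesis, reference, max_n=6, beta=2.0):
    """Simplified sentence-level chrF: average character n-gram
    precision and recall for n = 1..max_n, combined as an F-beta
    score (beta = 2 weights recall twice as heavily as precision)."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # sentence shorter than n characters
        overlap = sum((hyp & ref).values())
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0.0:
        return 0.0  # no overlapping n-grams at any order
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)

# Identical strings score 1.0; fully disjoint strings score 0.0.
print(chrf_score("kucing itu makan", "kucing itu makan"))  # 1.0
```

Character-level metrics like chrF are particularly relevant for morphologically rich low-resource languages, since they reward partially correct word forms that token-level BLEU would miss entirely.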

Files

Original bundle

Name: tmenezes_written_final_report-2.pdf
Size: 687.79 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 100 B
Format: Item-specific license agreed to upon submission