Workshop paper

Multilingual Code Explanation for Mainframe Languages

Abstract

Mainframe systems, written in legacy languages such as COBOL, PL/I, and JCL, continue to support mission-critical applications across many industries. Their complexity and limited documentation hinder maintenance and modernization, especially in regions where explanations in the local language are essential for accurate understanding. Existing approaches, however, predominantly generate English-only output and rely on resource-intensive models that are unsuitable for secure, on-premises environments. This study explores multilingual explanation generation for mainframe programs using lightweight language models suited to constrained enterprise settings. We evaluate two strategies, (a) direct generation in the target language and (b) translation-based generation from English, across five languages: Japanese, French, German, Spanish, and Portuguese. Explanation quality is assessed using BLEU, ROUGE-L, METEOR, and semantic similarity. Preliminary results show that lightweight models can produce semantically adequate multilingual explanations. Translation-based generation generally yields higher lexical and structural quality across languages and models, while direct generation shows promise in specific scenarios. These findings indicate that deploying multilingual explanation systems in enterprise environments is feasible and highlight opportunities to refine generation strategies based on language and code characteristics.