Automatic source code generation through Pretrained Language Models

Authors

  • Adrián Bender, Universidad del Salvador, Argentina
  • Santiago Nicolet, Universidad del Salvador, Argentina
  • Pablo Folino, Universidad del Salvador, Argentina
  • Juan José Lopez, Universidad del Salvador, Argentina
  • Gustavo Hansen, Universidad del Salvador, Argentina

DOI:

https://doi.org/10.24215/15146774e002

Keywords:

code generation, pretrained model, transformers, automation

Abstract

A Transformer is a Deep Learning model created in 2017 with the aim of performing translations between natural languages. The innovations it introduced, particularly the self-attention mechanism, made it possible to build models that have an intuitive notion of context and capture the meaning and underlying patterns of a language. In 2020, OpenAI released GPT-3, a pretrained model focused on language generation, which showed promising results, producing text of a quality that made it difficult to distinguish whether it was written by a human or by a machine. Since source code is text written in a formal language, it could likewise be generated by tools based on these models. This work presents a study of the evolution and the state of the art of this field: the automatic generation of source code from specifications written in natural language. We review different cases, their degree of success, the difficulty of finding suitable testing mechanisms, and their possible future adoption by companies.
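
The self-attention mechanism highlighted in the abstract can be conveyed with a short, self-contained sketch. The following is a minimal NumPy implementation of single-head scaled dot-product attention, the core operation of a Transformer; the array shapes and toy data are illustrative assumptions, not drawn from the paper or from any specific model discussed in it:

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a sequence of token vectors.

    x: (seq_len, d_model) token embeddings
    w_q, w_k, w_v: (d_model, d_k) learned projection matrices
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v              # project tokens to queries/keys/values
    scores = q @ k.T / np.sqrt(k.shape[-1])          # pairwise relevance, scaled for stability
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax: each token's attention distribution
    return weights @ v                               # context-aware mixture of value vectors

# Toy example (hypothetical sizes): 4 tokens, 8-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)  # (4, 8)
```

Each output row is a weighted combination of every token's value vector, which is how a Transformer lets each position "see" the rest of the sequence; stacking such layers (with multiple heads and feed-forward blocks) yields the contextual representations that language and code generation models build on.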

Published

2023-05-03

How to Cite

Bender, A., Nicolet, S., Folino, P., Lopez, J. J., & Hansen, G. (2023). Automatic source code generation through Pretrained Language Models. SADIO Electronic Journal of Informatics and Operations Research, 22(1), e002. https://doi.org/10.24215/15146774e002