Automatic source code generation through Pretrained Language Models

Authors

  • Adrián Bender, Universidad del Salvador, Argentina
  • Santiago Nicolet, Universidad del Salvador, Argentina
  • Pablo Folino, Universidad del Salvador, Argentina
  • Juan José Lopez, Universidad del Salvador, Argentina
  • Gustavo Hansen, Universidad del Salvador, Argentina

DOI:

https://doi.org/10.24215/15146774e002

Keywords:

code generation, pretrained model, transformers, automation

Abstract

A Transformer is a Deep Learning model created in 2017 with the aim of performing translations between natural languages. The innovations it introduced, particularly the self-attention mechanism, made it possible to build models that have an intuitive notion of context and capture the meaning and underlying patterns of a language. In 2020, OpenAI released GPT-3, a pretrained model focused on language generation, which showed promising results, producing text of a quality that made it difficult to distinguish whether it was written by a human or by a machine. Since source code is text written in a formal language, it could likewise be generated by tools based on these models. This work presents a study of the evolution and the state of the art of this field: the automatic generation of source code from specifications written in natural language. We review different cases, their degree of success, the difficulty of finding suitable testing mechanisms, and their possible future adoption by companies.
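
The self-attention mechanism highlighted in the abstract can be conveyed with a short, self-contained sketch. The following is a minimal NumPy implementation of single-head scaled dot-product attention, the core operation of a Transformer; the array shapes and toy data are illustrative assumptions, not drawn from the paper or from any specific model discussed in it:

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a sequence of token vectors.

    x: (seq_len, d_model) token embeddings
    w_q, w_k, w_v: (d_model, d_k) learned projection matrices
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v              # project tokens to queries/keys/values
    scores = q @ k.T / np.sqrt(k.shape[-1])          # pairwise relevance, scaled for stability
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax: each token's attention distribution
    return weights @ v                               # context-aware mixture of value vectors

# Toy example (hypothetical sizes): 4 tokens, 8-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)  # (4, 8)
```

Each output row is a weighted combination of every token's value vector, which is how a Transformer lets each position "see" the rest of the sequence; stacking such layers (with multiple heads and feed-forward blocks) yields the contextual representations that language and code generation models build on.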

Published

2023-05-03

How to Cite

Bender, A., Nicolet, S., Folino, P., Lopez, J. J., & Hansen, G. (2023). Automatic source code generation through Pretrained Language Models. SADIO Electronic Journal of Informatics and Operations Research, 22(1), e002. https://doi.org/10.24215/15146774e002