Automatic transcription of interviews and Universal Dependencies annotation in the Roda Viva Corpus

Authors

  • Cláudia Dias de Barros, Instituto Federal de Educação, Ciência e Tecnologia de São Paulo (IFSP), https://orcid.org/0009-0003-9388-4297
  • Oto Araújo Vale, Universidade Federal de São Carlos (UFSCar), São Carlos, São Paulo, Brasil
  • Gabriela Wick-Pedro, Universidade Federal de São Carlos (UFSCar), São Carlos, São Paulo, Brasil, https://orcid.org/0000-0002-7332-4482

DOI:

https://doi.org/10.21165/el.v54i1.3851

Abstract

This article reports on the automatic transcription of four interviews from the Roda Viva Corpus, which comprises 713 interviews from the Roda Viva program, broadcast on TV Cultura. The original interviews were transcribed by journalists and thus acquired the status of written text; they also contain editorial interventions, such as encyclopedic information about the facts and people mentioned. To work with spoken-language text, this study carried out a pilot in which four of these interviews were automatically transcribed with the Whisper tool. The transcribed interviews were then automatically annotated in the Universal Dependencies formalism and manually reviewed using the Arborator Grew ElizIA tool. This work made it possible to observe syntactic differences between the original corpus and the automatically transcribed interviews.
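As a rough illustration of how annotations from the two versions of an interview could be compared, the sketch below counts dependency relations in CoNLL-U data (the standard Universal Dependencies file format) and reports how the relative frequency of each relation shifts between a written and a spoken sample. It is not the authors' procedure: the function names `deprel_counts` and `deprel_shift` are hypothetical, and the two four-token samples are invented toy sentences, not taken from the Roda Viva Corpus.

```python
from collections import Counter


def deprel_counts(conllu_text):
    """Count dependency relations (DEPREL, the 8th CoNLL-U column)."""
    counts = Counter()
    for line in conllu_text.splitlines():
        # Skip blank sentence separators and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            continue
        # Skip multiword-token ranges (e.g. "1-2") and empty nodes (e.g. "1.1").
        if not cols[0].isdigit():
            continue
        counts[cols[7]] += 1
    return counts


def deprel_shift(written, spoken):
    """Relative frequency of each relation in `spoken` minus that in `written`."""
    total_w = sum(written.values()) or 1
    total_s = sum(spoken.values()) or 1
    relations = sorted(set(written) | set(spoken))
    return {r: spoken[r] / total_s - written[r] / total_w for r in relations}


def make_conllu(rows):
    """Build CoNLL-U lines from (id, form, lemma, upos, head, deprel) tuples."""
    return "\n".join(
        "\t".join([i, form, lemma, upos, "_", "_", head, rel, "_", "_"])
        for i, form, lemma, upos, head, rel in rows
    )


# Invented toy samples (assumption: NOT data from the corpus).
WRITTEN = make_conllu([
    ("1", "O", "o", "DET", "2", "det"),
    ("2", "programa", "programa", "NOUN", "3", "nsubj"),
    ("3", "começou", "começar", "VERB", "0", "root"),
    ("4", ".", ".", "PUNCT", "3", "punct"),
])
SPOKEN = make_conllu([
    ("1", "o", "o", "DET", "2", "det"),
    ("2", "programa", "programa", "NOUN", "3", "nsubj"),
    ("3", "começou", "começar", "VERB", "0", "root"),
    ("4", "né", "né", "INTJ", "3", "discourse"),
])

shift = deprel_shift(deprel_counts(WRITTEN), deprel_counts(SPOKEN))
for relation, delta in shift.items():
    print(f"{relation:10s} {delta:+.2f}")
```

In this toy case the spoken sample gains a `discourse` relation and loses `punct`, while the other relations are unchanged. A real comparison would read the reviewed CoNLL-U files and would need far more data than two sentences.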

Published

2025-12-17

How to Cite

Barros, C. D. de, Araújo Vale, O., & Wick-Pedro, G. (2025). Transcrição automática de entrevistas e anotação Universal Dependencies no Corpus Roda Viva. Estudos Linguísticos (São Paulo. 1978), 54(1), 29–45. https://doi.org/10.21165/el.v54i1.3851

Section

Articles