Dec 04

IATE terminology in MultiTerm format in a free-access database

IATE

The IATE inter-institutional terminology database (IATE stands for “Inter-Active Terminology for Europe”) is the largest openly accessible terminology repository on the web. With more than 1,180,000 entries, more than 7.9 million coded terms, and coverage of 24 languages, it has become an essential tool in any multilingual environment.

However, in professional use, consulting IATE online takes extra time that we are not always aware of, precisely because the tool is so useful.

The quality of the connection also matters, and it depends on where we connect from. Not every connection is a top-tier one.

There is a second option. In computer-assisted translation (CAT) environments, immediate access to the information is only possible through the corresponding terminology tool (SDL MultiTerm in the case of SDL Trados Studio). Consequently, accessing IATE terminology from within the CAT tool means reading the data directly from that terminology tool.

We also reduce typing time, since two or three keystrokes are enough to insert the translation into the active segment, even for terms like the one in the example. And this applies to all 8 million of them.

The problem arises when we try to handle the data, in TBX format, that IATE provides on its download page: the file is larger than 1.2 GB, a volume that is difficult to manage with the usual tools. A hypothetical conversion of the English-French language pair alone, with full data, would generate a database larger than 10 GB. The tools that come with CAT programs simply cannot handle such volumes.

The way to solve this is to divide the data. IATE classifies its terms into 22 categories, so these are the most natural fragments into which to split this terminology collection.
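As an illustration of what splitting by category could look like, here is a minimal Python sketch that streams a TBX file entry by entry instead of loading it whole. It is not the protocol used to build the databases described in this post. It assumes that each entry is a termEntry element and that its category is carried in a descrip element with type="subjectField"; both names, the "unclassified" fallback and the sample data are assumptions to check against the real IATE export.

```python
import io
import xml.etree.ElementTree as ET


def iter_entries_by_subject(source, entry_tag="termEntry", subject_type="subjectField"):
    """Yield (subject, entry_xml) pairs while streaming through a TBX file.

    entry_tag and subject_type are assumed names, not confirmed against IATE.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != entry_tag:
            continue
        subject = "unclassified"  # hypothetical fallback label
        for descrip in elem.iter("descrip"):
            if descrip.get("type") == subject_type and descrip.text:
                subject = descrip.text.strip()
                break
        yield subject, ET.tostring(elem, encoding="unicode")
        # Drop the entry's children and text; an empty stub stays in the tree.
        elem.clear()


# Tiny made-up sample, only to show the shape of the input.
SAMPLE = b"""<martif type="TBX"><text><body>
<termEntry id="1"><descrip type="subjectField">LAW</descrip>
<langSet xml:lang="en"><tig><term>contract</term></tig></langSet></termEntry>
<termEntry id="2"><langSet xml:lang="en"><tig><term>orphan</term></tig></langSet></termEntry>
</body></text></martif>"""

pairs = list(iter_entries_by_subject(io.BytesIO(SAMPLE)))
subjects = [subject for subject, _xml in pairs]
```

With a real file, each yielded entry would be appended to a per-category output file rather than collected in a list, so memory use stays roughly constant regardless of the size of the source file.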

For the smaller categories (by volume of entries and languages), the generated databases would be a few tens of MB (after optimization). However, more developed areas could generate databases close to 1 GB (also after optimization), which would be at the limit of their usefulness in CAT programs.

The next question is how to assess the usability of those databases. The tests would be run, for example, on an i7 computer with 12 GB of RAM. They would consist of measuring the time it takes for the terminology to be displayed in the SDL Trados Studio window, using a Lorem ipsum text in which each segment contains at least one term that exists in the database. If the average times (measured from the second segment onwards) were under five seconds (5 s), having the database in the format of our CAT tool would let us type the correct target term with just two or three keystrokes. The savings would be evident.
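The usability criterion above boils down to a small calculation: discard the first segment, average the remaining display times, and compare the result with the five-second threshold. A minimal Python sketch follows. The function names are hypothetical and the timings are invented purely for illustration; they are not measurements from the tests.

```python
from statistics import mean


def average_display_time(times_s, skip_first=1):
    """Average display time in seconds, ignoring the first skip_first segments."""
    measured = times_s[skip_first:]
    if not measured:
        raise ValueError("need at least one measurement after the skipped segments")
    return mean(measured)


def is_usable(times_s, threshold_s=5.0):
    """True if the average, measured from the second segment on, is under the threshold."""
    return average_display_time(times_s) < threshold_s


# Invented timings: the first segment is slow and is left out of the average.
example_times = [14.0, 3.0, 4.0, 2.0]
example_average = average_display_time(example_times)  # mean of 3.0, 4.0, 2.0
example_usable = is_usable(example_times)
```

Skipping the first segment keeps a one-off slow first lookup from distorting the average.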

There would be, however, one final obstacle: generating and optimizing the databases. Approximate cost: 1,000 hours. Or just get them at:

https://iate-terminology-multiterm-format.eu.

Salvador Aparicio, director and teacher at AulaSIC, with more than 35 years of experience in consulting and training in translation technologies, has designed and implemented a data management protocol, under the conditions established by IATE itself, to generate operational databases in SDL MultiTerm 2017-2019 format, with free access to the data (no protection or encryption) and with all the data included by IATE in its August 2018 distribution.

Using the tools provided by IATE, SDL MultiTerm (to ensure data homogeneity) and a simple text editor, we are able to generate terminology databases containing all the data supplied by IATE, organized into workable groups.

In a few weeks, our collection of IATE terminology glossaries in SDL MultiTerm format, distributed under the Creative Commons Attribution-NonCommercial-NoDerivatives license (CC BY-NC-ND), will cover the entire list of themes and languages available in the database (22 themes and 24 languages). Right now, twenty SDL MultiTerm databases are available for download and immediate use in a CAT environment (single-user license, company downloads… try them first, and then we’ll talk).

Some practical technical advice

  • Databases up to 400 MB will run correctly on a computer along the lines of an i3 with 4 GB of RAM.
  • Large databases (1 GB) will run correctly in the speed test on a computer such as the one described above (an i7 with 12 GB of RAM); on lower-performance PCs, response times will be a little slower.
  • We recommend an i5 computer with 8 GB of RAM as a minimum reference.

To conclude: using the appropriate terminology is always a guarantee of quality and consistency in a translation. Using it quickly and comfortably directly affects your productivity. And the higher the productivity, the greater the profits.

And some breaking news: at the beginning of the process, we asked IATE for authorization to include its official logo. This was the answer:

The conversion of the IATE data to a ready-to-use Multiterm database will surely be of interest to the translators’ community. Given the free access to the data that you are offering, we authorize you to use the IATE logo.

Connect to https://iate-terminology-multiterm-format.eu and give your profits a boost.

More information about IATE at https://iate.europa.eu/download-iate

Salvador Aparicio
AulaSIC