Data Engineer for Language Technologies (RE2)

Barcelona Supercomputing Center (BSC)
Catalonia
Full-time
3 days ago

Job Reference

522_25_LS_LT_RE2

Position

Data Engineer for Language Technologies (RE2)

Closing Date

Sunday, 24 August, 2025


About BSC


The Barcelona Supercomputing Center - Centro Nacional de Supercomputación (BSC-CNS) is the leading supercomputing center in Spain. It houses MareNostrum, one of the most powerful supercomputers in Europe, was a founding and hosting member of the former European HPC infrastructure PRACE (Partnership for Advanced Computing in Europe), and is now a hosting entity for EuroHPC JU, the Joint Undertaking that leads large-scale investments and HPC provision in Europe. The mission of BSC is to research, develop and manage information technologies in order to facilitate scientific progress. BSC combines HPC service provision and R&D in both computer and computational science (life, earth and engineering sciences) under one roof, and currently has over 1000 staff from 60 countries.


For this role, we are particularly interested in the strengths and lived experiences of women and underrepresented groups, which help us avoid perpetuating biases and oversights in science and IT research. In instances of equal merit, the incorporation of the under-represented sex will be favoured.

We promote Equity, Diversity and Inclusion, fostering an environment where each and every one of us is appreciated for who we are, regardless of our differences.

Even if you feel that you do not meet all the requirements, we encourage you to apply. We value diversity of experiences and skills, and you could bring unique perspectives to our team.


Context And Mission


The Language Technologies Laboratory at BSC has consolidated experience in several NLP areas, such as massive language model building, biomedical text mining, machine translation and unsupervised learning for under-resourced languages and domains. It has been entrusted by the Spanish and the Catalan governments with the mission to develop fundamental open-source resources and technologies for Spanish and Catalan. In connection with this, the LT Laboratory is currently in charge of two flagship projects at the national and regional level: the ALIA project, funded by the Spanish Secretariat of Digitalisation and Artificial Intelligence, and the AINA project, aimed at developing AI resources for Catalan, funded by the Catalan Digitalisation Department. In addition, the Laboratory participates in various EU-funded international projects.
The Language Technologies Laboratory is looking for candidates with a background in computational linguistics and experience in Language Technologies, specifically in Deep Learning and large language model building, and possibly in other areas of Natural Language and Speech Processing.
The successful candidate will work in a highly sophisticated HPC environment, have access to state-of-the-art systems and computational infrastructures, and establish collaborations with experts in different areas at the local and international levels.
The researcher will implement innovative techniques for language modelling and evaluation in the HPC environment.


Key Duties


  • Work, in collaboration with the group members, on the design and development of the solutions needed to achieve the goals of the group’s research projects.
  • Interact with relevant stakeholders of the group’s research projects to understand their problems and the available data to formulate valuable solutions.
  • Ensure the long-term acquisition, management and accessibility of language data through the design and implementation of scalable storage solutions, structured data systems and processing tools.
  • Collaborate with the members of the group in the generation and evaluation of language models using Deep Learning techniques (Transformers, Recurrent Neural Networks, and other neural network architectures).


Requirements


  • Education
    • Degree in Applied Linguistics, Computer Science or related disciplines with a very strong linguistic background.
  • Essential Knowledge and Professional Experience
    • Native speaker of Spanish.
    • Good knowledge of Python.
    • Good knowledge of Linux.
    • Knowledge of Deep Learning.
    • Experience in Machine Learning techniques applied to NLP.
    • Experience in or knowledge of corpus annotation and generation of linguistic resources.
    • Understanding of data administration and management functions (transfer, storage, analysis, distribution, exploration, etc.).
    • Research experience, with some publications related to language modeling and resources in a multilingual context.
  • Additional Knowledge and Professional Experience
    • Broad theoretical knowledge of AI techniques.
    • Knowledge of HPC workload managers such as Slurm.
    • Knowledge of Continuous Integration/Delivery/Deployment, including tools such as (or similar to) GitLab CI, GitHub, Docker and/or Ansible.
    • Experience in machine learning and data mining, including knowledge of PyTorch, TensorFlow, OpenCV, Pandas, Scikit-learn and/or NumPy.
    • Basic knowledge of GPU-based computing.
    • Fluency in spoken and written English.
    • Experience in web/data scraping.
    • Expertise in building and maintaining data-curation pipelines.
  • Competences
    • Capacity to explore new research lines.
    • Ability to work independently and collaboratively within multidisciplinary teams.
    • Proactive, detail-oriented mindset, capable of problem-solving in complex data contexts.
    • Good communication and presentation skills.
    • Commitment to deadlines and quality research output.


Conditions


  • The position will be located at BSC within the Life Sciences Department
  • We offer a full-time contract (37.5h/week), a good working environment, a highly stimulating environment with state-of-the-art infrastructure, flexible working hours, an extensive training plan, restaurant tickets, private health insurance, and support with relocation procedures
  • Duration: Open-ended contract due to technical and scientific activities linked to the project and budget duration
  • Holidays: 23 paid vacation days plus 24th and 31st of December per our collective agreement
  • Salary: we offer a competitive salary commensurate with the qualifications and experience of the candidate and according to the cost of living in Barcelona
  • Starting date: 01/10/2025


Application procedure and process


All applications must be submitted via the BSC website and contain:

  • A full CV in English including contact details
  • A cover/motivation letter with a statement of interest in English, clearly specifying for which specific area and topics the applicant wishes to be considered. Additionally, two references for further contacts must be included. Applications without this document will not be considered.

Development of the recruitment process


The selection will be carried out through a competitive examination system ("Concurso-Oposición"). The recruitment process consists of two phases:

  • Curriculum Analysis: Evaluation of previous experience and/or scientific history, degree, training, and other professional information relevant to the position. - 40 points
  • Interview phase: The highest-rated candidates at the curriculum level will be invited to the interview phase, conducted by the corresponding department and Human Resources. In this phase, technical competencies, knowledge, skills, and professional experience related to the position, as well as the required personal competencies, will be evaluated. - 60 points. A minimum of 30 points out of 60 must be obtained to be eligible for the position.

The recruitment panel will be composed of at least three people, ensuring at least 25% representation of women.


In accordance with OTM-R principles, a gender-balanced recruitment panel is formed for each vacancy at the beginning of the process. After reviewing the content of the applications, the panel will begin the interviews, with at least one technical and one administrative interview. At a minimum, a personality questionnaire as well as a technical exercise will be conducted during the process.


The panel will make a final decision, and all individuals who participated in the interview phase will receive feedback with details on the acceptance or rejection of their profile.




At BSC, we seek continuous improvement in our recruitment processes. For any suggestions or comments/complaints about our recruitment processes, please contact recruitment [at] bsc [dot] es.



Deadline


The vacancy will remain open until a suitable candidate has been hired. Applications will be regularly reviewed and potential candidates will be contacted.


OTM-R principles for selection processes


BSC-CNS is committed to the principles of the Code of Conduct for the Recruitment of Researchers of the European Commission and the Open, Transparent and Merit-based Recruitment principles (OTM-R). These principles apply to every potential candidate in all our processes, for example through gender-balanced recruitment panels and the recognition of career breaks.
BSC-CNS is an equal opportunity employer committed to diversity and inclusion. We are pleased to consider all qualified applicants for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability or any other basis protected by applicable state or local law.