Junior Data Engineer


We are currently looking for a Junior Data Engineer who will support data harmonization projects. This includes designing and implementing ETL pipelines and setting up data platform solutions for data pipelines, analytical tools, and other applications.


Based on a harmonization design for mapping a client’s healthcare data to a common data model, prepare technical specifications, create prototypes, and design, code, and test the ETL solution

Develop ETL solutions in Python, SQL, Java, Pentaho DI, or other systems

Design and implement data quality control protocols and processes

Participate in all phases of the Agile development process

Research technologies and build software to support the team in providing first-class real-world evidence (RWE) services to clients

Build data platforms for clients to support data exploration, analytics, visualization and modelling

Required Qualifications

  • Degree

    Advanced technical degree (Master's) in Computer Science, Bioengineering, Data Science, Physics, Applied Sciences,
    or a similar field

  • Experience

    - Some experience as a software developer, analyst, or in a similar role
    - Experience working with, extracting, and analyzing (clinical) data from diverse data sets
    - Experience building solutions using cloud and DevOps technologies
    - Experience developing ETL solutions
    - Proven ability to write clean, high-quality, testable code

  • Knowledge

    - In-depth knowledge of SQL and one or more programming languages such as Python, R, Java, or C++
    - Good knowledge of Linux
    - In-depth knowledge of database systems and data modelling
    - Familiarity with version control systems

  • Language

    - Excellent verbal and written communication skills
    - Fluent in English; knowledge of Dutch, French, or other languages is a plus

Preferred Qualifications

  • Experience

    - Experience working in the health or life sciences domain
    - Experience with OMOP CDM and OHDSI tools and libraries

  • Knowledge

    - Familiarity with deploying solutions using Docker and AWS
    - Expert-level Python skills
    - Knowledge of medical terminologies and controlled vocabularies (ICD-10, SNOMED, LOINC, RxNorm, etc.) used in healthcare data and ontologies
    - Knowledge of healthcare technologies and standards such as HL7, FHIR, and DICOM