Data Engineer (AMK)
Permanent
We are looking for a skilled Data Engineer with hands-on experience in workflow orchestration, data integration, and cloud-based big data platforms. The role focuses on migrating and re-platforming existing integration workflows, building reliable ETL pipelines, and contributing to Lakehouse-style data architectures that support both batch and streaming workloads.
The ideal candidate has practical experience with integration platforms, structured data ingestion, and distributed data processing frameworks, and is comfortable operating in complex enterprise environments with many applications, large data volumes, and operational constraints.
Key Responsibilities
Integration & Workflow Engineering
- Analyse and re-platform existing workflow-based integrations from one platform to another while maintaining functional parity.
- Implement orchestration logic including triggers, conditional routing, retries, exception handling, and state management.
- Design and implement automation patterns to handle platform limitations (e.g., looping, batching, pagination, throttling).
- Ensure workflows are idempotent and fault tolerant, so reruns and partial failures do not corrupt data (a brief retry/idempotency sketch follows this list).
- Support environment-based deployments (DEV / UAT / PROD) with configuration-driven designs.
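As a purely illustrative sketch of the retry and idempotency patterns mentioned above, the Python snippet below shows one way such a step might be wrapped; the step function, payload, idempotency key, and in-memory processed-keys store are all hypothetical placeholders rather than the API of any particular orchestration platform.

```python
import time
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("workflow")

# Hypothetical store of already-processed idempotency keys; in a real
# workflow this would be a database table or the platform's state store.
_processed_keys = set()


def run_step_with_retries(step, payload, idempotency_key,
                          max_attempts=3, backoff_seconds=2):
    """Run a workflow step with exponential backoff and skip duplicates."""
    if idempotency_key in _processed_keys:
        log.info("Skipping %s: already processed", idempotency_key)
        return None

    for attempt in range(1, max_attempts + 1):
        try:
            result = step(payload)
            _processed_keys.add(idempotency_key)  # record success so reruns are no-ops
            return result
        except Exception as exc:  # in practice, catch only transient error types
            log.warning("Attempt %d/%d failed: %s", attempt, max_attempts, exc)
            if attempt == max_attempts:
                raise  # surface to the platform's exception handling and alerting
            time.sleep(backoff_seconds * 2 ** (attempt - 1))
```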
Data Engineering & ETL
- Design and build ETL pipelines for ingesting flat files (e.g., CSV) into relational databases (illustrated in the sketch after this list).
- Handle schema validation, basic schema evolution, data quality checks, and error reconciliation.
- Optimize data ingestion for performance, scalability, and reliability.
- Collaborate with application teams to understand upstream and downstream data dependencies.
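The following is a minimal sketch of file-based ingestion with basic schema and data-quality checks, assuming pandas and SQLAlchemy are available; the expected columns, table name, and connection URL are hypothetical examples, not a prescribed design.

```python
import pandas as pd
from sqlalchemy import create_engine

# Hypothetical expected schema; column names and types are placeholders.
EXPECTED_COLUMNS = {"customer_id": "int64", "amount": "float64", "created_at": "object"}


def load_csv_to_table(csv_path, connection_url, table_name):
    """Validate a flat file against the expected schema and load it into a relational table."""
    df = pd.read_csv(csv_path)

    # Basic schema validation: fail fast on missing columns.
    missing = set(EXPECTED_COLUMNS) - set(df.columns)
    if missing:
        raise ValueError(f"Missing columns in {csv_path}: {sorted(missing)}")

    # Simple data-quality check: reject files with null keys and report the count.
    bad_rows = df["customer_id"].isna().sum()
    if bad_rows:
        raise ValueError(f"{bad_rows} rows have a null customer_id")

    engine = create_engine(connection_url)
    # Append-load; a production pipeline would add staging, dedup, and reconciliation.
    df.to_sql(table_name, engine, if_exists="append", index=False)
    return len(df)
```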
Cloud & Big Data Platform Contributions
- Build and maintain data pipelines on cloud-based big data platforms using distributed processing frameworks (see the sketch after this list).
- Contribute to Lakehouse-style data storage that supports both batch and streaming data.
- Work with modern table formats that support incremental processing, versioning, and historical queries.
- Support use cases such as append-heavy datasets, high-write event data, and analytical queries.
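To make the distributed-processing and append-heavy use cases concrete, here is a small PySpark batch sketch; the source and target paths, column names, and the plain-Parquet sink are assumptions for illustration (a Lakehouse deployment would typically target a table format such as Delta or Iceberg instead).

```python
from pyspark.sql import SparkSession, functions as F

# Hypothetical paths; replace with the platform's actual storage locations.
SOURCE_PATH = "s3://example-bucket/raw/events/"
TARGET_PATH = "s3://example-bucket/curated/daily_event_counts/"

spark = SparkSession.builder.appName("daily-event-counts").getOrCreate()

# Batch read of raw event files (schema inference kept simple for the sketch).
events = spark.read.json(SOURCE_PATH)

# Aggregate high-write event data into an analytics-friendly daily summary.
daily_counts = (
    events
    .withColumn("event_date", F.to_date("event_timestamp"))
    .groupBy("event_date", "event_type")
    .agg(F.count("*").alias("event_count"))
)

# Partitioned, append-style write suited to append-heavy datasets.
daily_counts.write.mode("append").partitionBy("event_date").parquet(TARGET_PATH)
```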
Operations, Quality & Observability
- Implement logging, monitoring, and alerting for workflows and data pipelines.
- Support operational readiness including runbooks, deployment procedures, and rollback strategies.
- Participate in root-cause analysis and continuous improvement of data pipelines.
- Ensure adherence to data governance, security, and compliance standards.
Required Skills & Experience
Technical Skills
- Strong experience with workflow orchestration / integration platforms.
- Solid understanding of ETL concepts and hands-on experience with file-based ingestion.
- Proficiency in SQL and working knowledge of relational databases (e.g., SQL Server or equivalent).
- Experience with distributed data processing frameworks (e.g., Spark).
- Familiarity with streaming and batch data processing concepts.
- Practical experience with cloud platforms and managed data services.
- Understanding of CI/CD principles for data and integration workflows.
Programming & Automation
- Experience with scripting or programming languages commonly used in data engineering (e.g., Python).
- Ability to build reusable utilities for batching, retries, pagination, and error handling (see the sketch after this list).
- Experience with REST APIs and structured data formats (JSON, CSV).
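As an example of the reusable pagination-and-retry utilities described above, the sketch below walks a hypothetical REST endpoint that accepts `page` and `per_page` query parameters and returns a JSON list; real APIs vary (cursor tokens, link headers, rate limits), so this is a pattern, not a specific integration.

```python
import time
import requests


def fetch_all_pages(base_url, page_size=100, max_retries=3):
    """Fetch every page from a paginated REST endpoint with simple retry handling."""
    page = 1
    results = []
    while True:
        # Retry each page a few times before giving up, with exponential backoff.
        for attempt in range(1, max_retries + 1):
            try:
                resp = requests.get(
                    base_url,
                    params={"page": page, "per_page": page_size},
                    timeout=30,
                )
                resp.raise_for_status()
                break
            except requests.RequestException:
                if attempt == max_retries:
                    raise
                time.sleep(2 ** attempt)

        batch = resp.json()
        if not batch:
            return results  # an empty page signals the end of the data set
        results.extend(batch)
        page += 1
```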
Desired (Good to Have)
- Exposure to modern Lakehouse table formats supporting incremental processing and time travel.
- Experience with stream processing engines (e.g., Flink or Spark Streaming).
- Familiarity with query engines used for analytical access.
- Experience working in regulated environments (banking, financial services, or manufacturing).
- Knowledge of data observability and quality frameworks.
Experience Level
- Typically 3–5 years of relevant experience in data engineering, integration engineering, or similar roles.
- Proven ability to work independently on moderately complex problems while collaborating within a larger team.
Soft Skills & Competencies
- Strong analytical and problem-solving skills.
- Ability to understand existing systems and translate requirements into working solutions.
- Clear communication with technical and non-technical stakeholders.
- Detail-oriented with a focus on reliability and maintainability.
- Comfortable working across multiple teams and applications.
What Success Looks Like in This Role
- Existing workflows are successfully re-platformed with minimal disruption.
- Data pipelines are stable, observable, and easy to operate.
- Platform limitations are addressed through clean, maintainable automation.
- Data ingestion and processing meet performance and reliability expectations.
- Stakeholders trust the data pipelines and integration workflows in production.