POSITION INFO:
AWS Data Engineer
Location: Hybrid – Cape Town / Johannesburg
Type: Full-Time
Team: Data Engineering
The client is on a mission to establish one of Africa’s leading engineering hubs. As part of this vision, they’re looking for a seasoned AWS Data Engineer to help design, build, and maintain scalable data solutions in the cloud.
This role is ideal for someone who thrives in dynamic environments, enjoys solving complex data challenges, and has hands-on experience with AWS technologies. The scope of work will vary across projects, so adaptability and a broad technical foundation are key.
Core Responsibilities
Engineering Foundations
- Apply strong software engineering principles across all development work.
- Work confidently in Linux environments and command-line interfaces.
- Use Git for version control and collaborate effectively in codebases.
- Demonstrate proficiency in Python and SQL, with a solid grasp of algorithms, data structures, and design patterns.
Cloud Infrastructure
- Evaluate and implement serverless, managed, and custom infrastructure solutions.
- Use Infrastructure as Code tools like Terraform or AWS CloudFormation to provision resources.
- Align solutions with the AWS Well-Architected Framework.
Data Collection & Ingestion
- Build pipelines that move data between on-premises systems and cloud platforms in both directions.
- Work with real-time streaming tools such as Amazon Kinesis Data Streams (KDS) and Kafka/MSK, and near-real-time services such as Kinesis Data Firehose.
- Implement batch ingestion using AWS DataSync, Storage Gateway, Transfer Family, and Snowball.
- Integrate with databases using ODBC/JDBC, replication tools, and AWS migration services (DMS, SCT).
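To give a flavour of the streaming work above: the Kinesis `PutRecords` API accepts at most 500 records per call, so producers typically batch their output before sending. A minimal sketch of that batching logic, assuming an illustrative record shape (a real producer would hand each batch to boto3's `kinesis.put_records`):

```python
from itertools import islice
from typing import Iterable, Iterator

MAX_BATCH = 500  # Kinesis PutRecords accepts at most 500 records per call


def batched(records: Iterable[dict], size: int = MAX_BATCH) -> Iterator[list]:
    """Yield successive lists of at most `size` records."""
    it = iter(records)
    while batch := list(islice(it, size)):
        yield batch


# Illustrative events only -- a real producer would pass each batch to
# boto3's kinesis.put_records(StreamName=..., Records=batch).
events = [{"Data": f"event-{i}", "PartitionKey": str(i)} for i in range(1200)]
batches = list(batched(events))
```

Keeping the batching as a pure generator like this makes the size limit easy to unit-test without touching AWS.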
Data Storage & Management
- Manage both cloud and on-premises storage systems (e.g., NFS, SMB).
- Store and organize data in formats like Parquet, Avro, CSV, and JSON, with compression and partitioning.
- Work with NoSQL (DynamoDB, MongoDB) and relational databases (Amazon RDS for MySQL/PostgreSQL, Aurora).
- Leverage Redshift for massively parallel processing (MPP) workloads and OpenSearch for search capabilities.
- Implement caching strategies using Redis or Memcached.
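The partitioning mentioned above usually follows the Hive-style `key=value` layout that Athena, Glue, and Spark all understand. A hedged sketch of building such an S3 object key (the prefix and filename are invented for the example):

```python
from datetime import date


def partitioned_key(prefix: str, event_date: date, filename: str) -> str:
    """Build a Hive-style partitioned object key, e.g.
    raw/year=2024/month=05/day=07/part-0000.parquet
    Engines that understand this layout can prune partitions at query time.
    """
    return (
        f"{prefix}/year={event_date.year}"
        f"/month={event_date.month:02d}"
        f"/day={event_date.day:02d}"
        f"/{filename}"
    )


key = partitioned_key("raw", date(2024, 5, 7), "part-0000.parquet")
```

Zero-padding the month and day keeps keys lexicographically sortable, which matters for range scans over date partitions.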
Data Processing
- Develop robust ETL workflows using Python, SQL, and Spark (e.g., PySpark).
- Use AWS Lambda for event-driven processing and automation.
- Apply Lakehouse technologies such as Apache Hudi, Iceberg, or Delta Lake.
- Utilize AWS Glue for ETL, cataloging, and access control, and manage clusters with EMR.
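Event-driven processing with Lambda, as mentioned above, typically starts from an S3 event notification. A minimal handler sketch, invoked locally with a trimmed-down sample event (bucket and key names are made up):

```python
import json
import urllib.parse


def handler(event: dict, context=None) -> dict:
    """Minimal S3-triggered Lambda sketch: extract (bucket, key) pairs from
    an S3 event notification. A real handler would go on to read and
    transform each object."""
    objects = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        objects.append({
            "bucket": s3["bucket"]["name"],
            # S3 notification keys are URL-encoded ('+' stands for a space)
            "key": urllib.parse.unquote_plus(s3["object"]["key"]),
        })
    return {"statusCode": 200, "body": json.dumps(objects)}


# Local invocation with a trimmed-down sample event:
sample = {"Records": [{"s3": {"bucket": {"name": "demo-bucket"},
                              "object": {"key": "raw/file+1.csv"}}}]}
result = handler(sample)
```

Decoding the key with `unquote_plus` matters in practice: object names containing spaces or special characters arrive URL-encoded in the notification payload.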
Data Analysis & Modeling
- Design cloud-based data warehouses and integrate data across systems.
- Apply data modeling techniques including normalization and dimensional modeling.
- Ensure data quality and enable querying via AWS Athena and Glue Crawlers.
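Dimensional modeling, listed above, can be illustrated with a tiny star schema; `sqlite3` stands in for a warehouse engine here, and the table and column names are invented for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One dimension table and one fact table: a minimal star schema.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE fact_sales  (product_id INTEGER, amount REAL);
    INSERT INTO dim_product VALUES (1, 'widget'), (2, 'gadget');
    INSERT INTO fact_sales  VALUES (1, 9.5), (1, 3.0), (2, 7.25);
""")

# Typical analytical query: join facts to the dimension and aggregate.
rows = conn.execute("""
    SELECT p.name, SUM(f.amount) AS total
    FROM fact_sales f
    JOIN dim_product p USING (product_id)
    GROUP BY p.name
    ORDER BY p.name
""").fetchall()
```

The same fact-to-dimension join pattern carries over directly to Redshift or Athena, just at warehouse scale.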
Security & Governance
- Implement identity federation, RBAC, and secure access using AWS IAM and STS.
- Secure data in transit and at rest using AWS KMS, SSE, and TLS.
- Configure secure networking with VPCs, endpoints, subnets, and AWS Direct Connect.
Operations & Orchestration
- Automate workflows using AWS Step Functions, Amazon Managed Workflows for Apache Airflow (MWAA), and Glue.
- Apply architectural principles across operational excellence, security, reliability, performance, cost-efficiency, and sustainability.
Preferred Skills & Experience
- Experience with additional data platforms (e.g., Hadoop, analytics tools).
- Exposure to multi-cloud environments (AWS and Azure).
- Familiarity with containerization (Docker) and CI/CD pipelines.
- Contributions to open-source projects are a plus.
Qualifications
- Bachelor’s or Master’s degree in Computer Science, Engineering, IT, or related field.
- 5+ years in data engineering, with a strong focus on AWS.
- Deep understanding of AWS data services, architecture, and modeling.
- Advanced SQL skills and scripting experience in Python.
- Knowledge of big data tools and machine learning concepts.
- Experience with real-time analytics and enterprise integration patterns.
- Familiarity with DevOps practices for data workflows.
- Strong communication and teamwork abilities.
- Proven success in delivering scalable data solutions.
- Willingness to travel for client engagements.
Certifications (Preferred)
- AWS Certified Data Engineer – Associate
- AWS Certified Machine Learning – Specialty
- AWS Solutions Architect – Associate / Professional
- Databricks Certified Data Engineer / Analyst / ML Associate / Professional
- Microsoft Certified: Azure Data Engineer / AI Engineer / Data Scientist / Solutions Architect / Fabric Engineer
What You’ll Get
- A builder-friendly culture where innovation is encouraged.
- Competitive compensation, bonuses, and incentives.
- A flexible, inclusive work environment that supports growth and balance.
- Opportunities to work on impactful technologies and products.
- Access to mentorship and a collaborative peer network.