Senior Data Engineer required to join a large media company, working in a fast-paced environment with BI and blend with...
The ideal candidate will have strong working experience of architecting and implementing data processing pipelines on one or more cloud-based data processing and storage systems/services, handling real-time, high-volume data in varying formats (structured and unstructured); working experience with AWS services such as EMR (using Hive, Spark etc.), Data Pipeline and IAM, or equivalents from non-AWS vendors; and good knowledge of Java and Python. Key skills and experience:
- Data warehousing concepts (change data capture, ETL, data marts etc.)
- Building data marts/BI applications for specific business areas from multiple data sources such as files, APIs and messaging streams (Kafka, Kinesis etc.)
- Using data and analytics technologies/services such as Redshift, S3, RDS and BigQuery, or comparable technologies
- Programming languages - Java or Python
- Creating ETL pipelines for data processing workloads covering change data capture, set-based data transformations and data movement between AWS services
- Implementing, optimising and administering cloud-based parallel processing/in-memory/columnar relational databases, e.g. BigQuery, Redshift
- Ability to set up and configure ETL frameworks/orchestration tools such as Airflow or Oozie using Java or Python
- Event/stream parallel data processing solutions such as Spark on AWS, or similar technology
- Data processing using Hadoop/HBase/Hive
- Data security principles, using AWS security rules/policies for communication between AWS services and for securing PII data stored, transferred or processed on them
- Working with NoSQL databases, e.g. Cassandra, DynamoDB
- Data manipulation languages - SQL, Python
The Data Engineer/Architect role is to help design and build a data pipeline for a B2B data analytics and reporting platform. The platform requires ingestion of data from multiple sources of varying formats, velocity and size; this data goes through integration and transformation to apply business context and prepare a reusable data set for business reporting, integration with CRM systems, and other potential internal and external user-facing solutions.
You will be responsible for understanding the business outcomes and current technical challenges/limitations, and for designing a brand-new system architecture for a Next Generation Data Pipeline that is more real-time, scalable and reusable. You will also be responsible for implementing a prototype of the defined services/design. You will help shape the project direction and use continuous deployment within a genuinely agile team striving to deliver quality products to realistic timescales. You will also be encouraged to explore new technologies and approaches that best fit the business problems.
To find out more please apply today!
We are an equal opportunities employer and welcome applications from all suitably qualified persons regardless of their race, sex, disability, religion/belief, sexual orientation, gender reassignment, marriage and civil partnership, pregnancy or maternity, or age.