Job Description
Machintel is a leading B2B marketing services company that specializes in helping businesses achieve their marketing goals through innovative and data-driven strategies. We are seeking a Senior Data Engineer to join our dynamic team. This is an exciting opportunity to make a significant impact in a rapidly evolving industry.
Skills
-
A Bachelor’s degree and a minimum of 5 years of relevant experience as a data engineer.
-
Hands-on deployment experience with Hadoop/Spark, Scala, MySQL, Redshift, and AWS or other cloud-based systems.
-
Comfortable writing code in Python, Ruby, Perl, or an equivalent scripting language.
-
Experience with Cosmos/Scope, SQL, or Hadoop.
-
At least 3 years of professional experience programming in Python, Java, or Scala.
-
2+ years of experience with distributed computing frameworks such as Apache Spark or Hadoop.
Responsibilities
-
Design and develop ETL (extract-transform-load) processes to validate and transform data, calculate metrics and attributes, and populate data models, using Hadoop, Spark, SQL, and other technologies.
-
Lead by example, demonstrating best practices for code development and optimization, unit testing, CI/CD, performance testing, capacity planning, documentation, monitoring, alerting, and incident response to ensure data availability, data quality, usability, and required performance.
-
Use programming languages such as SAS, R, Python, and SQL to create automated processes for data gathering, cleansing, reporting, and visualization.
-
Implement systems for tracking data quality, usage, and consistency.
-
Design and develop new data products.
-
Monitor and maintain system health and security.
-
Oversee administration of, and improvements to, source control and deployment processes.
-
Prepare unit tests for all work to be released to our live environment (including data validation scripts for data set releases or changes).
-
Tune database performance based on monitoring results.
-
Design and implement data products using Hadoop technologies.
-
Maintain clear documentation of process flow diagrams and best practices.
-
Design and implement multi-source data channels and ETL processes.
Qualifications
-
BS or MS in Computer Science, Computer Engineering, Data Science, or a related discipline.