Data Engineer

Pune · Day · Remote · 3 years of experience

Machintel is a leading B2B marketing services company helping businesses achieve their marketing goals through innovative and data-driven strategies. We are seeking a Senior Data Engineer to join our dynamic team. This is an exciting opportunity to significantly impact a rapidly evolving industry.

Skills 

  • A Bachelor’s degree and a minimum of 3 years of relevant experience as a data engineer
  • Hands-on deployment experience with Hadoop/Spark, Scala, MySQL, Redshift, and AWS or other cloud-based systems
  • Comfortable writing code in Python, Ruby, Perl, or an equivalent scripting language
  • Experience with Cosmos/Scope, SQL, or Hadoop
  • At least 3 years of professional experience programming in Python, Java, or Scala
  • 2+ years of experience with distributed computing frameworks such as Apache Spark and Hadoop

Responsibilities

  • Design and develop ETL (extract-transform-load) processes to validate and transform data, calculate metrics and attributes, and populate data models using Hadoop, Spark, SQL, and other technologies
  • Work with cloud technologies such as S3 and cloud-hosted databases
  • Lead by example, demonstrating best practices for code development and optimization, unit testing, CI/CD, performance testing, capacity planning, documentation, monitoring, alerting, and incident response to ensure data availability, quality, usability and required performance.
  • Use programming languages such as SAS, R, Python, and SQL to create automated data gathering, cleansing, reporting, and visualization processes.
  • Implement systems for tracking data quality, usage, and consistency
  • Design and develop new data products using appropriate programming languages and frameworks
  • Monitor and maintain system health and security
  • Oversee administration and improvement of source control and deployment processes
  • Prepare unit tests for all work to be released to our live environment (including data validation scripts for data set releases or changes)
  • Tune database performance based on monitoring results
  • Design and implement data products using Hadoop technologies
  • Clearly document process flow diagrams and best practices
  • Design and implement multi-source data channels and ETL processes
  • Work with AWS services such as EMR, Athena, Glue, Redshift, and Lambda

Apply Now