Data Engineer (Spark/Flink/Scala Engineer)

Mountain View, CA
 
 
We are urgently looking for a Data Engineer for a direct client requirement.
 
TITLE: Data Engineer (Spark, Flink, Scala Engineer)
LOCATION: Mountain View, CA
DURATION: 6+ Months
RATE: DOE

Job Description:

Role Purpose:
We are looking for Scala Engineers with experience in batch and/or streaming jobs. We utilize Spark for batch jobs and Flink for real-time streaming jobs. Experience with Hadoop, Hive, and AWS S3 is also an asset.
Major Responsibilities:
  • Create new, and maintain existing, Spark jobs written in Scala
  • Create new, and maintain existing, Flink jobs written in Scala
  • Produce unit and system tests for all code
  • Participate in design discussions to improve our existing frameworks
  • Define scalable calculation logic for interactive and batch use cases
  • Interact with infrastructure and data teams to produce complex analysis across data
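To give a flavor of the work, here is a hedged sketch (not part of the client's codebase; all names are hypothetical) of the kind of batch calculation logic involved, written in plain Scala collections. The same pipeline maps directly onto Spark's RDD/Dataset API, with `groupMapReduce` replaced by `reduceByKey` for distributed execution:

```scala
// Hypothetical sketch: word-count style aggregation in plain Scala (2.13+).
// In a real Spark batch job the same transformations would run over an
// RDD/Dataset rather than an in-memory Seq.
object BatchSketch {
  def wordCounts(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split("\\s+"))                 // tokenize each line
      .filter(_.nonEmpty)                        // drop empty tokens
      .map(_.toLowerCase)                        // normalize case
      .groupMapReduce(identity)(_ => 1)(_ + _)   // count occurrences per token

  def main(args: Array[String]): Unit = {
    val counts = wordCounts(Seq("Spark and Flink", "spark jobs"))
    println(counts("spark")) // 2
  }
}
```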
 
Background, Experience & Qualifications:
  • A minimum of 2 years of experience with Scala and/or Java
  • Required experience with Hadoop and Spark
  • Knowledge and experience with cloud-based technologies
  • Experience in batch or real-time data streaming
  • Ability to adapt dynamically to conventional big-data frameworks and open-source tools as the project demands
  • Knowledge of design strategies for developing scalable, resilient, always-on data lakes
  • Strong development/automation skills
  • Must be very comfortable with reading and writing Scala code
  • An aptitude for analytical problem solving
  • Deep knowledge of troubleshooting and tuning Spark applications and Hive scripts to achieve optimal performance
  • Good understanding of HDFS architecture and its components, such as JobTracker, TaskTracker, NameNode, DataNode, HDFS high availability (HA), and the MapReduce programming paradigm
  • Experience working with various Hadoop distributions (Cloudera, Hortonworks, MapR, Amazon EMR) to fully implement and leverage new Hadoop features
  • Experience developing Spark applications using the Spark RDD, Spark SQL, Spark on YARN, Spark MLlib, and DataFrame APIs
  • Experience with real-time data processing and streaming techniques using Spark Streaming and Kafka, including moving data in and out of HDFS and RDBMSs
  • Familiarity with open-source configuration management and development tools
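On the streaming side, the windowed-aggregation logic behind the qualifications above can be sketched in plain Scala (a hypothetical illustration; all names are invented). A production Flink job would express the same idea with the DataStream API (`keyBy`/`window`/`reduce`) over a Kafka source rather than an in-memory sequence:

```scala
// Hypothetical sketch: tumbling-window event counts in plain Scala (2.13+).
// A real Flink job would read from Kafka and use event-time windows with
// watermarks; here we just bucket timestamps into fixed-size windows.
object StreamSketch {
  case class Event(key: String, timestampMs: Long)

  // Assign each event to a tumbling window of `windowMs` and count per (key, window).
  def tumblingCounts(events: Seq[Event], windowMs: Long): Map[(String, Long), Int] =
    events.groupMapReduce(e => (e.key, e.timestampMs / windowMs))(_ => 1)(_ + _)

  def main(args: Array[String]): Unit = {
    val events = Seq(Event("clicks", 100L), Event("clicks", 900L), Event("views", 1500L))
    println(tumblingCounts(events, 1000L))
  }
}
```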
Other:
  • Hands-on experience and production use of Hadoop/Cassandra, Spark, Flink, and other distributed technologies would be a plus
  • Other technologies: ScalaTest, Gradle/Maven, Airflow, SQL, AWS
  • Bachelor’s Degree required