Job type: full-time
Full job description
You will clean, transform, and analyze vast amounts of raw data from various systems using Spark to provide ready-to-use data to our developers and business analysts.
This involves both ad hoc requests and data pipelines that are embedded in our production environment.
Responsibilities:
Create Scala/Spark jobs for data transformation and aggregation
Produce unit tests for Spark transformations and helper methods
Write Scaladoc-style documentation for all code
Design data processing pipelines
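To illustrate the kind of job described above, here is a minimal sketch of a Scala/Spark aggregation job with Scaladoc-style documentation. The job name, input/output paths, and the `userId` column are hypothetical, not taken from the posting.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, count}

/** Aggregates raw event data into per-user event counts.
  *
  * A hypothetical example of the transformation-and-aggregation
  * jobs the role calls for.
  */
object EventAggregationJob {

  /** Counts events per user, dropping rows with a null userId.
    *
    * @param events raw events with at least a `userId` column
    * @return one row per user with an `eventCount` column
    */
  def aggregate(events: DataFrame): DataFrame =
    events
      .filter(col("userId").isNotNull)
      .groupBy("userId")
      .agg(count("*").as("eventCount"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("EventAggregationJob")
      .getOrCreate()

    // Hypothetical input and output locations.
    val events = spark.read.parquet("s3://example-bucket/raw/events/")
    aggregate(events)
      .write
      .mode("overwrite")
      .parquet("s3://example-bucket/curated/event_counts/")

    spark.stop()
  }
}
```

Keeping the transformation as a pure `DataFrame => DataFrame` function, separate from I/O, makes it straightforward to cover with the unit tests the posting also requires.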
Required skills:
5 to 6 years' experience using Spark
Scala with a focus on the functional programming paradigm
ScalaTest, JUnit, Mockito
Knowledge of Apache Spark 2.x: RDD API, Spark SQL DataFrame API, MLlib API, GraphX API, Streaming API
Spark query tuning and performance optimization
SQL database integration (MSSQL, Oracle, Postgres)
Experience working with HDFS, S3, Cassandra, and DynamoDB
Deep understanding of distributed systems.
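The testing stack listed above (ScalaTest, JUnit, Mockito) would typically be used to unit-test transformation helpers against a local SparkSession, no cluster required. A minimal ScalaTest sketch follows; the `dedupeByKey` helper under test is hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession, functions => F}
import org.scalatest.funsuite.AnyFunSuite

/** Example unit test for a Spark helper method, using ScalaTest's FunSuite. */
class TransformationSpec extends AnyFunSuite {

  // A local SparkSession is sufficient for unit-testing transformations.
  private val spark = SparkSession.builder()
    .master("local[2]")
    .appName("TransformationSpec")
    .getOrCreate()

  import spark.implicits._

  /** Hypothetical helper: keeps one row per key, preferring the highest version. */
  def dedupeByKey(df: DataFrame): DataFrame =
    df.groupBy("key").agg(F.max("version").as("version"))

  test("dedupeByKey keeps the highest version per key") {
    val input = Seq(("a", 1), ("a", 2), ("b", 1)).toDF("key", "version")
    val result = dedupeByKey(input).as[(String, Int)].collect().toMap
    assert(result === Map("a" -> 2, "b" -> 1))
  }
}
```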
Nice to have skills:
25/07/2022, 5:07:35 AM
Posted 24 days ago