Distributed Systems, Software Engineer (Remote, ROU)

Published 15.09.2024 | Expires 04.11.2024

Job description

About The Role:

CrowdStrike is looking to hire a Software Engineer to join the Data Infra Engineering team as a Distributed Systems Engineer. On this team, we are on a mission to create a hyperscale data lake that helps find bad actors and stop breaches. The team builds and operates systems to centralize all of the data the Falcon platform collects, making it easy for internal and external customers to transform and access the data for analytics, machine learning, and threat hunting.

As a Software Engineer on this team, you will be responsible for building our Flink/Spark ecosystem on Kubernetes, working with Flink, Kafka, Spark, MinIO, HDFS, Trino, Hive, Pinot, etc. We are looking for candidates who have a deep understanding of large-scale distributed big data systems in the data center or cloud (AWS; PB-scale would be a plus) and who are passionate about solving problems at high scale. This role involves leading efforts to build a state-of-the-art Flink/Spark Kubernetes platform.

What You’ll Do:

Design the control plane for large-scale distributed systems
Implement a monitoring and visualization system for large-scale Kubernetes systems
Implement CRDs to orchestrate the distributed system
Build performance monitoring and notification tooling
Communicate problems effectively
Experience in developing large-scale distributed systems
Strong algorithmic skills
Well versed in programming in Golang, Java, or Python

What You’ll Need:

Strong background in one or both fields, i.e., the Flink/Spark ecosystem and the Kubernetes ecosystem
Strong analytical skills and a deep understanding of distributed systems
Strong programming skills in languages such as Go, Python, or Java
Understanding of Apache Flink/Spark ecosystem technologies (Flink operator or Spark operator, Kafka, FluxCD, ArgoCD, Jenkins pipelines, etc.)
Experience with large-scale, business-critical platforms running Flink/Spark on Kubernetes in the data center or cloud
Experience with continuous deployment on Kubernetes with Helm, FluxCD, ArgoCD, etc.
Solid understanding of either Flink or Spark (data) and Kubernetes storage systems (object stores such as S3/MinIO, Spark ephemeral storage, persistent volume mappings and claims)
Understanding of Flink streaming or Spark memory management, or experience with Spark internals
Familiarity with Chef is preferred.
Proven ability to work with both local and remote teams
Strong communication skills, both verbal and written