Job Number: z5G7h3l6a1kMvyS65NP3c-Jkb7ECwVo52FkpPUR_3t4=
Pay Rate: $90,000 to $100,000 per year (USD)
Job Title: Sr System Reliability Engineer (Big Data)
Location: Remote, USA (travel to the client's HQ in St. Louis required once or twice per month for on-site meetings).
Job Type: Full-time, Permanent
Required Skills: Linux; monitoring tools such as Splunk or Dynatrace; PL/SQL or Oracle Database; ITSM/ITIL fundamentals; big data tools such as Spark (preferred), Hive (preferred), or Hadoop.
Roles and Responsibilities:
- Plan, manage, and oversee all aspects of a Production Environment for Big Data Platforms.
- Define strategies for application performance monitoring and optimization in the production environment.
- Respond to incidents, improve the platform based on feedback, and measure the reduction of incidents over time.
- Ensure that batch production scheduling and processes are accurate and timely.
- Create and execute queries against big data platforms and relational data tables to identify process issues or perform mass updates (preferred; see the query sketch after this list).
- Perform ad hoc requests from users, such as data research, file manipulation/transfer, and investigation of process issues.
- Take a holistic approach to problem solving: during a production event, connect the dots across the various technology stacks that make up the platform to optimize mean time to recovery.
- Engage in and improve the whole lifecycle of services—from inception and design, through deployment, operation, and refinement.
- Analyze ITSM activities of the platform and provide a feedback loop to development teams on operational gaps or resiliency concerns.
- Support services before they go live through activities such as system design consulting, capacity planning, and launch reviews.
- Support the application CI/CD pipeline for promoting software into higher environments through validation and operational gating, and lead Mastercard in DevOps automation and best practices.
- Maintain services once they are live by measuring and monitoring availability, latency, and overall system health (see the measurement sketch after this list).
- Scale systems sustainably through mechanisms like automation and evolve systems by pushing for changes that improve reliability and velocity.
- Work with a global team spread across tech hubs in multiple geographies and time zones.
- Share knowledge and explain processes and procedures to others.
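
To make the query work above concrete, here is a minimal PySpark sketch that surfaces failed batch runs from an audit table. It is an illustration only: the table and column names (batch_audit, job_name, status, run_date) are hypothetical placeholders, not the client's actual schema.

    # Minimal sketch: querying a Big Data platform to identify process issues.
    # All table and column names are hypothetical placeholders.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("batch-failure-triage").getOrCreate()

    # Count failures per job over the last 7 days to spot recurring problems.
    failures = spark.sql("""
        SELECT job_name, COUNT(*) AS failure_count
        FROM batch_audit
        WHERE status = 'FAILED'
          AND run_date >= date_sub(current_date(), 7)
        GROUP BY job_name
        ORDER BY failure_count DESC
    """)
    failures.show(truncate=False)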
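Similarly, for the measurement responsibility above, this is a small self-contained sketch of how availability and tail latency might be computed from raw request samples. In practice these figures would come from a monitoring tool such as Splunk or Dynatrace; the sample data here is invented for illustration.

    # Sketch: computing availability and p99 latency from request samples.
    # Real deployments would pull these from Splunk/Dynatrace; the sample
    # data below is invented for illustration.
    import math

    samples = [
        {"ok": True, "latency_ms": 42},
        {"ok": True, "latency_ms": 55},
        {"ok": False, "latency_ms": 3000},  # a timeout counts against availability
        {"ok": True, "latency_ms": 61},
    ]

    availability = sum(s["ok"] for s in samples) / len(samples)

    latencies = sorted(s["latency_ms"] for s in samples)
    # Nearest-rank percentile: the sample at the 99th-percentile position.
    p99 = latencies[min(len(latencies) - 1, math.ceil(0.99 * len(latencies)) - 1)]

    print(f"availability: {availability:.2%}, p99 latency: {p99} ms")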
Required Qualifications:
- Experience with Big Data technologies (Hadoop, Spark, NiFi, Impala).
- Experience with data analysis, data observability, data ingestion, and data integration.
- 5+ years of relevant data engineering, data infrastructure, DataOps, DevOps, SRE, or general systems engineering experience.
- 5+ years of experience running Big Data production systems.
- 1+ years of hands-on experience with industry-standard CI/CD tools such as Git/Bitbucket, Jenkins, Maven, Artifactory, and Chef.
- Experience architecting and implementing data governance processes and tooling (such as data catalogs, lineage tools, and role-based access control).
- Solid grasp of SQL fundamentals.
- Experience with algorithms, data structures, scripting, pipeline management, and software design.
- Systematic problem-solving approach, coupled with strong communication skills and a sense of ownership and drive.
- Ability to help debug and optimize code and automate routine tasks.
- Ability to support many different stakeholders; experience handling difficult situations and making decisions with a sense of urgency is required.
- Appetite for change and pushing the boundaries of what can be done with automation.
- Experience in working across development, operations, and product teams to prioritize needs and to build relationships is a must.
- Experience designing and implementing an effective and efficient CI/CD flow that gets code from dev to prod with high quality and minimal manual effort is desired (a gating sketch follows this list).
- A good handle on the change management and release management aspects of software delivery.
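
To make the CI/CD gating requirements concrete, here is a hedged sketch of an operational gate a pipeline (for example, a Jenkins stage) could run before promoting a build to a higher environment. The health-check URL and the exit-code convention are assumptions made for illustration, not a description of the client's actual pipeline.

    # Sketch: a pre-promotion gate a CI/CD pipeline could run.
    # The health endpoint below is a hypothetical example.
    import sys
    import urllib.request

    HEALTH_URL = "https://staging.example.com/health"  # hypothetical endpoint

    def deployment_healthy() -> bool:
        """Return True only if the staging deployment reports healthy."""
        try:
            with urllib.request.urlopen(HEALTH_URL, timeout=5) as resp:
                return resp.status == 200
        except OSError:
            return False

    if __name__ == "__main__":
        # A non-zero exit code blocks promotion in most CI systems.
        sys.exit(0 if deployment_healthy() else 1)

In a real pipeline, a gate like this would typically sit alongside test-result and change-approval checks before the promote step.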