SRE DevOps Engineer

Sunnyvale, CA


One of our direct clients is urgently looking for an SRE DevOps Engineer in Sunnyvale, CA.

Title: SRE DevOps Engineer
Location: Sunnyvale, CA
Duration: 6 to 12+ Months
Rate: DOE

Key Skills:
Splunk, Grafana, SRE, Cloud, DevOps, Azure, Docker, Kubernetes, Java (Basic), Python (Scripting)

Description:
You’ll sweep us off our feet if…
  • You can dig into issues on our eCommerce site and identify root cause, and have experience supporting and triaging production incidents
  • Subject matter expert in creating dashboards, alerting, and monitoring
  • Experience with Application development and root cause analysis
  • Develops innovation strategies, processes, and automation; failover experience
  • Drives the execution of multiple business plans and projects
  • Experience in driving high availability across multiple organizations
  • Experience in putting together architecture diagrams
  • Experience in managing workloads in private and public data centers
  • Infrastructure experience that involves setup, scaling, and decommissioning
  • Prior cloud experience, planning and driving efficiencies
  • Automation and CI/CD experience
  • Application container experience using Kubernetes
  • Experience with event streaming platforms like Kafka is a plus
  • Experience with analytics and monitoring platforms like Grafana/Graphite/MMS/Splunk is a plus

You’ll make an impact by:
  • Supporting Java full-stack backend application system components in a massively scalable, high-performance, multi-tenant, international eCommerce platform with multiple microservices deployed in a cloud environment, and root-causing every reactive/proactive production issue.
  • Leads and participates in medium- to large-scale, complex, cross-functional projects
  • Partners with architects and development leads to come up with high-level designs to accelerate omni customer experience, recommending out-of-box engineering best practices.
  • Proactively identifies areas to drive automation/speed/innovation
  • Troubleshoots business and production issues by gathering information (for example, issue, impact criticality, possible root cause); performing root cause analysis to reduce future issues; engaging support teams to assist in the resolution of issues; developing solutions; driving the development of an action plan; performing actions as designated in the plan; interpreting the results to determine further action; and completing online documentation.
  • Provides support to the business by responding to user questions, concerns, and issues (for example, technical feasibility, implementation strategies); researching and identifying needed solutions; determining implementation designs; providing guidance regarding implications of new and enhanced systems; identifying short and long term solutions; and directing users to appropriate contacts for issues outside of associate's domain.
  • Assists in providing guidance to small groups of 5 to 6 engineers, including offshore associates, for assigned Engineering projects by providing pertinent documents, directions, examples, and timeline.
  • Demonstrates up-to-date expertise and applies this to the development, execution, and improvement of action plans by providing expert advice and guidance to others in the application of information and best practices; supporting and aligning efforts to meet customer and business needs; and building commitment for perspectives and rationales.
  • Models compliance with company policies and procedures and supports company mission, values, and standards of ethics and integrity by incorporating these into the development and implementation/support of business plans; using the Open Door Policy; and demonstrating and assisting others with how to apply these in executing business processes and practices.
  • Provides and supports the implementation of business solutions by building relationships and partnerships with key stakeholders; identifying business needs; determining and carrying out necessary processes and practices; monitoring progress and results; recognizing and capitalizing on improvement opportunities; and adapting to competing demands, organizational changes, and new responsibilities.

Minimum Qualifications:
  • Hands-on experience debugging HTTP 5xx and 4xx errors
  • Java/Spring and Node/Python experience is required
  • Creating database objects (tables, views, indexes)
  • CI/CD automation and implementation experience
  • Kubernetes and Docker experience is a plus
  • Implementing database structures such as tables and indexes
  • Reviewing and tuning SQL scripts
  • Reviewing database structure changes provided by application developers and data modelers
  • Working with application developers to tune the performance of the database

Ideal Candidate Must-Haves:
  • Experience creating best in class application availability metrics and dashboards
  • Managing infrastructure scale, setup and decommissioning
  • Experience with public cloud (Azure, GCP) and private data centers
  • Driving P1 production incident calls, communicating to the point, summarizing action plans for each owner, and following up until closure
  • Ability to make the right prioritization decisions and drive operational excellence with innovative ideas, without much guidance or supervision
  • Ability to build and run tools necessary for operational success
  • Documenting SOPs for repetitive issues and building knowledge base articles for the team’s benefit
