Staff Site Reliability Engineer, Cloud Efficiency
at Okta
Bengaluru, India
Get to know Okta
Okta is The World’s Identity Company. We free everyone to safely use any technology, anywhere, on any device or app. Our flexible and neutral products, Okta Platform and Auth0 Platform, provide secure access, authentication, and automation, placing identity at the core of business security and growth.
At Okta, we celebrate a variety of perspectives and experiences. We are not looking for someone who checks every single box - we’re looking for lifelong learners and people who can make us better with their unique experiences.
Join our team! We’re building a world where Identity belongs to you.
We are looking for a Staff Site Reliability Engineer to join Okta’s Infrastructure Platform FinOps team. The FinOps team creates tooling and workflows that enable cloud cost optimization, discount management, project costing, and financial accountability, helping the organization make informed decisions around cloud spending. Sitting inside the SRE organization, our team directly supports the company’s efficiency and scalability goals by helping align engineering and finance teams on responsible cloud usage. You will act as technical lead for a small team of junior engineers focused on accelerating FinOps and related best practices for the SRE organization. The ideal candidate exemplifies the ethic of “If you have to do something more than once, automate it” and can rapidly self-educate on new concepts and tools.
What you’ll be doing?
- Work within, and serve as technical lead for, a specialized SRE team designing, building, running, and monitoring FinOps tooling that helps project-cost, measure, estimate, tag, and cost-optimize our global production infrastructure.
- Design and implement FinOps tooling & pipelines that collect, analyze and present usage and cost data from infrastructure in a variety of instance- and container-based architectures such as EC2, ECS and EKS across multiple environments.
- Continuously evolve and maintain our cost monitoring tools and platforms, acting as a full-stack developer for critical internal tooling and leveraging technologies such as Python, Bash, NodeJS or React/Angular, Nginx, AppSync, S3, IAM, EC2/Fargate, Lambda, DynamoDB, Glue, and Athena.
- Deploy FinOps tooling and policies into containerized environments using Terraform, Cloud Custodian (c7n), and CloudFormation.
- Build and manage data pipelines that feed our AWS Cost and Usage Report (CUR) data into analytics tooling such as Amazon QuickSight and Tableau.
- Set up budgeting and alerting using SQS and SNS with AWS Cost Explorer and other self-managed tooling, including GHG inventory management, and the equivalents in GCP or other cloud infrastructure environments.
- Be an evangelist for FinOps best practices and lead initiatives and projects to strengthen our cost optimization posture for critical infrastructure.
- Develop and maintain technical documentation, runbooks, and procedures, and help develop training and FinOps updates for the broader Engineering team.
- Be a technical SME on cost optimization and FinOps practices for a broader Engineering team that designs and builds Okta's production infrastructure, focusing on security at scale in the cloud.
What you’ll bring to the role?
- Are always willing to go the extra mile: see a problem, fix the problem.
- Are passionate about encouraging the development of engineering peers and leading by example.
- Have experience with AWS and/or GCP cloud environments.
- Have an understanding of and familiarity with configuration management tools like Terraform and CloudFormation, as well as Cloud Custodian for this role.
- Have expert-level abilities in operational tooling languages such as Python, Go, and Bash for the back end and NodeJS for the front end, and are proficient with source control systems.
- Have knowledge of various types of data stores and data pipelines, particularly DynamoDB, Athena, Glue.
- Have experience with industry-standard FinOps tooling such as Cloud Custodian and OpenCost/Kubecost.
- Have knowledge of CI/CD principles, Linux fundamentals, OS hardening, networking concepts, and IP protocols.
- Are skilled in using CloudWatch, Grafana, and Splunk for real-time monitoring and proactive incident detection.
- Are able to lead small technical teams, collaborate closely with cross-functional teams, and promote a cost-aware engineering culture.
Experience in the following
- 8+ years of experience running and managing complex AWS or other cloud networking infrastructure resources including architecture, security, scalability and cost optimization
- 8+ years of coding/automation experience using Python, Bash, and NodeJS
- 5+ years of experience with Terraform or other Infrastructure as Code
- 5+ years of experience in automating CI/CD pipelines using tooling such as Spinnaker, ArgoCD or general GitOps with an emphasis on integrating security throughout the process.
- 4+ years of experience with Kubernetes, ECS, or related container management
- Exposure to implementing monitoring and observability solutions such as Grafana or Splunk to enhance security and detect incidents in real time.
- Strong leadership and collaboration skills with experience working cross-functionally with site reliability engineers and developers to encourage cost-optimization best practices and policies.
- Strong Linux understanding and experience.
- Strong security background and knowledge.
- BS in Computer Science (or equivalent experience).
What you can look forward to as a Full-Time Okta employee!
- Amazing Benefits
- Making Social Impact
- Developing Talent and Fostering Connection + Community at Okta
Okta cultivates a dynamic work environment, providing the best tools, technology and benefits to empower our employees to work productively in a setting that best and uniquely suits their needs. Each organization is unique in the degree of flexibility and mobility it offers, so that all employees are enabled to be their most creative and successful selves, regardless of where they live. Find your place at Okta today! https://www.okta.com/company/careers/.
Some roles may require travel to one of our office locations for in-person onboarding.
Okta is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, ancestry, marital status, age, physical or mental disability, or status as a protected veteran. We also consider for employment qualified applicants with arrest and conviction records, consistent with applicable laws.
If reasonable accommodation is needed to complete any part of the job application, interview process, or onboarding, please use this form to request an accommodation.
Okta is committed to complying with applicable data privacy and security laws and regulations. For more information, please see our Personnel and Job Candidate Privacy Notice at https://www.okta.com/legal/personnel-policy/.