Principal Site Reliability Engineer
Company: Discover
Location: Providence
Posted on: March 16, 2023
Job Description:
Principal Site Reliability Engineer
Remote
R19086
About This Role
Discover. A brighter future.
With us, you'll do meaningful work from Day 1. Our collaborative
culture is built on three core behaviors: We Play to Win, We Get
Better Every Day & We Succeed Together. And we mean it - we want you
to grow and make a difference at one of the world's leading digital
banking and payments companies. We value what makes you unique so
that you have an opportunity to shine.
Come build your future, while being the reason millions of people
find a brighter financial future with Discover.
Job Description
At Discover, be part of a culture where diversity, teamwork, and
collaboration reign. Join a company that is as focused on its
employees as it is on its customers and is consistently awarded for
both. We're all about people, and our employees are why Discover is
a great place to work. Be the reason we help millions of consumers
build a brighter financial future and achieve yours along the way
with a rewarding career.
As a Principal Site Reliability Engineer, you will be responsible
for the Site Reliability function for Consumer Banking. You will be
a hands-on engineer and technical Subject Matter Expert, accountable
for effectively developing, tooling, and maturing Site Reliability
Engineering practices. You will drive incident management best
practices, enforce appropriate quality and user experiences, drive
out waste and toil through automation, and foster a culture of
continuous innovation and learning. You'll tap into your passion for
finding and fixing inefficiencies to solve our reliability and
performance issues, with an emphasis on availability, latency,
performance, efficiency, change and problem management, monitoring,
self-healing automation, and capacity planning of our services.
Responsibilities:
Work with a team of site reliability engineers that is responsible
for building the continuous reliability mindset, shepherding
problem management, and driving key site reliability engineering
practices into the organization.
Collaborate with other delivery teams in Consumer Banking to
identify and close process, application, and infrastructure gaps to
increase operational efficiency and platform stability.
Design and drive monitoring, alerting, and ticket reporting
strategies to measure SLA, SLO, MTTI, MTTR, etc., and align with
management expectations to minimize production downtime.
Guide site reliability automation to help eliminate manual toil and
create a self-healing capability.
Participate in selection of appropriate automation tools, defining
technology, quality, experience and implementation standards and
practices within own technical domain. Develops own technical
skills to attain Subject Matter Expertise in at least one technical
implementation within own technical domain. Ensures consistency of
technical execution and knowledge, sharing common practices and
challenges.
Fosters a culture of excellence and continuous learning within the
chapter. Establishes and tracks to appropriate OKRs to ensure
outcomes are met.
Utilizes engineering feedback and performance/satisfaction metrics
to identify areas of continuous improvement within Consumer
Banking. Engages with internal and external communities of practice
to share experiences, contribute knowledge, learn and advocate for
the Discover Technology brand. Promotes team innovation and
collaboration of ideas across teams.
Creates solutions addressing high impact technology and business
priorities.
Is competent in multiple contexts, such as programming languages,
security, automation, testing, infrastructure, performance, and
business domains, and is the go-to person for many people (inside
and outside of their team).
Proactively identifies and mitigates issues based on intuition and
experience in multiple domains.
Minimum Qualifications
At a minimum, here's what we need from you:
Bachelor's degree in Computer Science or a related field
6+ years of IT or related experience
Internal only: Dreyfus rating of Proficient
Preferred Qualifications
If we had our say, we'd also look for:
10+ years of experience working in a mix of legacy and cloud
computing environments
4+ years of experience in automated software build and deployment
in a distributed cloud environment
4+ years of experience in a tech leadership role
4+ years of application development, architecture, or infrastructure
experience
Knowledge of containerization (Kubernetes) platforms and Docker
Clear understanding of SRE best practices, performance management,
capacity analysis, and creating fault-tolerant deployment
patterns
Exposure to the OpenShift Container Platform
Expertise and operational experience at scale - designing and
operating highly available, scalable, and fault-tolerant
systems
Experience with operational monitoring tools (AppDynamics,
New Relic, Instana, Catchpoint) with a mindset towards predictive
analysis
Experience with Splunk or ELK Stack, Grafana, or Datadog
Knowledge of automation tools such as Ansible, Terraform, or
Chef
Experience with Pivotal Cloud Foundry (PCF), OpenShift (OCP), or
Amazon Web Services (AWS)
Experience with Continuous Integration and Continuous Delivery
models including Blue/Green and Canary release models is a plus
External applicants will be required to complete a technical
interview.
What are you waiting for? Apply today!
All Discover employees place our customers at the very center of
our work. To deliver on our promises to our customers, each of us
contributes every day to a culture that values compliance and risk
management.
The same way we treat our employees is how we treat all applicants
- with respect. Discover Financial Services is an equal opportunity
employer (EEO is the law:
https://www1.eeoc.gov/employers/poster.cfm). We thrive on
diversity & inclusion. You will be treated fairly throughout our
recruiting process and without regard to race, color, religion,
sex, sexual orientation, gender identity, national origin,
disability, or veteran status in consideration for a career at
Discover.