Site Reliability Engineering Manager - Linux (Open to Remote)
Daxko powers health & wellness throughout the world. Every day our team members focus their passion and expertise on helping health & wellness facilities operate efficiently and engage their members.
Whether for a neighborhood yoga studio, a national franchise with locations in every city, a YMCA or JCC, or any type of organization in between, we build solutions that make every aspect of running and being a member of a health and wellness organization easier and more delightful.
The Site Reliability Engineering Manager manages all production assets for each product. This position's responsibilities include batching, upgrading, deploying new servers, organizing the team's workload, supporting engineering efforts, compliance, uptime, and performance monitoring. You'll be responsible for prioritizing, organizing, and leading your team in the execution of all work. The SRE Manager assesses operational capabilities and performance to ensure the on-time delivery of quality products and services to all customers, both internal and external.
The Site Reliability Engineering Manager reports to the VP of Platform Engineering.
Supervisory Responsibilities:
Sets and helps team understand performance targets and goals
Evaluates and provides real-time feedback on performance
Trains and/or ensures that team is properly trained for their specific roles
Coordinates on-call rotation
Coordinates training for staff
Assists in resolving emergencies, such as infrastructure or software outages
Manages headcount and makes staffing decisions related to new hires and terminations
Your Responsibilities:
Oversee progress in achieving operational/production goals and objectives, especially with respect to quality, cost, and customer service
Take the responsibility for uptime, data accuracy, and integrity
Interact with Engineering Leads to assure alignment between teams
Maintain business continuity for all production assets
Ensure proper planning and prioritization using agile practices
Ensure operations are in full compliance with all company and regulatory requirements
Be a technical escalation point for your team
Provide weekly reports on system availability, response, and capacity
Manage on-call rotation among team members
< 5% Travel Required
Budget Responsibilities include ensuring fiscal responsibility for Hosting and Software licensing
Solid foundation in the following technologies:
Linux
Web Servers (NGINX / PHP / Traefik / F5)
Virtualization Technologies (VMware)
Cloud Platforms (AWS, Azure)
Containerization Systems (Docker, Kubernetes, Dynos)
Caching technology (Redis / RabbitMQ)
Strong security mindset and experience implementing security controls
Excellent verbal and written communication skills.
Excellent interpersonal and customer service skills.
Excellent organizational skills and attention to detail.
Excellent time management skills with a proven ability to meet deadlines.
Strong analytical and problem-solving skills.
Strong supervisory and leadership skills.
Ability to prioritize tasks and to delegate them when appropriate.
Ability to function well in a high-paced and at times stressful environment.
Bachelor's degree - technical discipline preferred; OR equivalent experience
Three (3) to five (5) years of experience managing globally distributed team members
Three (3) to five (5) years of experience in a site reliability engineering capacity
Bonus Points for:
Bachelor's degree - technical discipline preferred
Four or more (4+) years of technical lead experience
Four (4) to five (5) years of experience in a site reliability engineering capacity
Strong observability experience with monitoring technologies (OpenTelemetry, Instana, LogicMonitor, PagerDuty, OpsGenie), including creating custom checks and managing alert profiles and escalation policies
Experience with tooling (GitLab CI, Jenkins, Chef, Terraform, Elasticsearch, Kubernetes, Rancher)
Scripting experience with the following languages: Ruby, Python, Bash
Experience with SOC, PCI, GDPR standards and regulations
Experience working tickets and managing priorities within issue tracking systems (Atlassian Suite, etc.)
Experience developing or supporting Java, PHP, or Node.js applications
Experience automating repetitive tasks
Additional Information
Daxko is dedicated to pursuing and hiring a diverse workforce. We are committed to diversity in the broadest sense, including thought and perspective, age, ability, nationality, ethnicity, orientation, and gender. The skills, perspectives, ideas, and experiences of all of our team members contribute to the vitality and success of our purpose and values.
We truly care for our team members, and this is reflected in our offices, benefits, and great perks. These perks are available only to our full-time team members. Some of our favorites include:
Flexible paid time off
Affordable health, dental, and vision insurance options
Monthly fitness reimbursement
401(k) matching
New-Parent Paid Leave
1-month paid sabbatical every 5 years
Casual work environments
Remote work
All your information will be kept confidential according to EEO guidelines.