joblet.ai

AI-powered job search connecting talent with opportunity.

ELEVEN AI, Inc.
200 Continental Drive, Suite 401
Newark, DE 19713

© 2026 ELEVEN AI, Inc. joblet.ai is a product of ELEVEN AI, Inc. All rights reserved.

Overview

Company
Xactus
Location
all cities, DC 8
Employment type
On-site
Xactus (Verified Employer)

Business Services & Consulting • all cities, DC 8

Director of Platform Operations & Reliability (8)

all cities, DC 8 • On-site • Posted 1 day ago
Business Services & Consulting

About the Role

WHO WE ARE:

Xactus (pronounced 'Zac-tus') is the leading verification innovator for the mortgage industry. We have over 6,500 clients ranging from the largest bank and non-bank mortgage originators to credit unions and mortgage brokers. Xactus works closely with our clients to digitally integrate a 360° approach to verification across their workflows. As a result, lenders can easily access the technology necessary to meet consumer demands for a modern mortgage experience with industry-leading speed, reliability, and accuracy - while also closing more loans more quickly with greater profitability.

Xactus is proud to provide a friendly work environment that is primarily remote. Our workplace provides many opportunities for you to enhance your skills with our top-notch financial leadership team, which prioritizes building talent. If you are curious and searching for a change that will be fun and rewarding, please reach out to us! We would love to have you on our team!

WHO YOU ARE:

You are a career-minded, driven individual who is looking for a position that challenges you and supports your professional development.

THE BENEFITS WE OFFER:

A friendly, supportive environment which is highly rated by Xactus employees. Feedback from our employees says: "The people I work with treat each other with respect," "I feel accepted by my coworkers," and "The person I report to cares about me as a person."

Xactus offers medical, vision, and dental insurance; bonus programs; fitness reimbursement and other healthy-lifestyle programs through our benefits carrier; a 401(k) plan with a company match; short- and long-term disability; life insurance; accident and critical illness insurance; a health savings account; a flexible spending account; an employee assistance program; legal services; employee discounts; and more.

SUMMARY:

The Director of Platform Operations & Reliability is a critical leadership role responsible for the health, stability, and continuous improvement of Xactus' client-facing platforms. This leader owns four interconnected operational pillars: platform health and reliability, third-party integration vigilance, incident management maturity, and observability and alerting advancement. The third-party pillar cuts across the other three: it requires maintaining proactive awareness of third-party integration dependencies, ensuring that certificate expirations, credential rotations, IP changes, and endpoint updates are tracked, communicated, and acted upon before they cause disruption.

The ideal candidate is operationally rigorous, automation-obsessed, and thrives in a high-velocity agile environment operating on two-week sprint cadences. Equally important is the ability to communicate the value of operational investments - translating monitoring capabilities, process improvements, and reliability work into business outcomes for Tech and Product leadership. This role manages an Incident Management Specialist and a Site Reliability Engineer, with the opportunity to grow the team over time.

ESSENTIAL FUNCTIONS:
The following is a list of essential functions, which is subject to change at any time and without advance notice. Management may assign new duties, reassign existing duties, or eliminate a function based on business needs or at its sole discretion.

ESSENTIAL DUTIES AND RESPONSIBILITIES:

Pillar 1: Platform Health & Reliability
  • Own the operational health of all client-facing Xactus platforms - the primary lens through which this role operates.
  • Monitor platform performance, availability, and capacity; identify degradation trends before they become incidents.
  • Partner with Engineering and DevOps to drive improvements to system resiliency, deployment stability, and recovery capabilities.
  • Define and track SLIs, SLOs, and error budgets for core platform services; report regularly to Tech and Product leadership.
  • Maintain and continuously improve runbooks, operational playbooks, and disaster recovery procedures.
  • Champion a culture of reliability and platform ownership across engineering teams.
Pillar 2: Third-Party Integration Vigilance
  • Maintain proactive awareness of all third-party integration dependencies that support client-facing platform delivery.
  • Own a living inventory of integration touchpoints, including:
    • TLS/SSL certificate expiration dates and renewal schedules
    • API credential rotation schedules and secrets hygiene
    • IP allowlist dependencies and endpoint change notifications
    • Partner-driven deprecations, version upgrades, or connectivity changes
  • Ensure appropriate action is taken - and the right internal teams are engaged - whenever integration parameters change.
  • Serve as the escalation point with third-party partners during integration-related incidents or degradations.
  • Automate integration health checks, credential rotation reminders, and certificate expiry alerts wherever possible.
Pillar 3: Incident Management Maturation
  • Own and continuously mature Xactus' end-to-end incident management process - from detection through post-incident review.
  • Lead war-room coordination during major incidents; maintain composure, decisiveness, and clear stakeholder communication under pressure.
  • Define and enforce incident severity classifications, escalation paths, on-call rotations, and response SLAs.
  • Drive post-incident reviews and root cause analysis (RCA) with measurable corrective action follow-through.
  • Automate incident workflows: alert routing, on-call paging, runbook execution, and stakeholder notifications.
  • Produce regular incident trend reporting for engineering and product leadership; surface systemic reliability risks proactively.
Pillar 4: Monitoring, Alerting & Observability Maturity
  • Own the Xactus observability platform (Datadog and related tooling) - configuration, governance, and roadmap.
  • Lead the maturation of monitoring and alerting capabilities; identify coverage gaps and drive improvements through the sprint backlog.
  • Build and maintain platform health dashboards and automated reporting for engineering and executive audiences.
  • Advocate for observability investments with Tech and Product leadership - clearly communicating how new alerting and monitoring capabilities reduce business risk and improve MTTR.
  • Develop the ability to prioritize high-value monitoring capabilities within a competitive feature backlog, making and defending priority calls with limited shared resources.
  • Automate alert tuning, noise reduction, and reporting pipelines to reduce manual toil and improve signal quality.
  • Embed observability best practices into the SDLC in partnership with DevOps and engineering teams.
Leadership & Cross-Functional Collaboration
  • Directly manage an Incident Management Specialist and a Site Reliability Engineer; provide mentorship, day-to-day leadership, and performance development.
  • Build a high-performing operations team culture with a strong bias toward automation, continuous improvement, and shared accountability.
  • Operate effectively within a two-week sprint cadence; own a prioritized backlog of operational capabilities and advocate for resourcing against a competitive feature backlog.
  • Communicate clearly and confidently across all organizational levels - from engineers to C-suite - translating operational complexity into clear, business-relevant narratives.
  • Partner with Engineering, Product, DevOps, and Security to align operational priorities with platform delivery goals.
QUALIFICATIONS:

To perform this job successfully, an individual must be able to perform each essential duty satisfactorily. The requirements listed below are representative of the knowledge, skill, and/or ability required.

Reasonable accommodations may be made to enable individuals with disabilities to perform the essential functions.

EDUCATION AND/OR EXPERIENCE:
  • Bachelor's degree in Information Technology, Computer Science, or a related field; equivalent experience considered.
  • 5+ years of experience in technology operations, incident management, SRE, or a related discipline.
  • Proven track record managing client-facing platform reliability and large-scale incident response.
  • Hands-on proficiency with Datadog (required) - dashboards, monitors, APM, log management, and alerting configuration.
  • Experience managing third-party integration dependencies, including certificate and credential lifecycle awareness.
  • Strong automation mindset; experience scripting or building operational tooling to eliminate manual processes.
  • Demonstrated ability to prioritize within a dense, competing operational and feature backlog while maintaining business alignment.
  • Exceptional communication skills - able to articulate the value of operational investments to non-technical product and business stakeholders.
  • Comfortable operating with urgency and clarity in a high-velocity, agile environment with two-week sprint cadences.
Preferred Qualifications
  • Experience with cloud platforms (AWS, Azure) and cloud-native observability patterns.
  • Familiarity with incident management frameworks (ITIL, PagerDuty, Opsgenie, StatusPage, or similar).
  • Prior experience in a SaaS, fintech, or data-intensive technology company.
  • Experience with secrets management tooling (Keeper, HashiCorp Vault, AWS Secrets Manager, or similar).
  • Familiarity with infrastructure-as-code or configuration management tooling (Terraform, Ansible, etc.).
SKILLS AND COMPETENCIES:
  • Platform-first ownership: deep care for the reliability and health of client-facing systems
  • Proactive vigilance: identifies risk before it becomes an incident - across platforms and integrations alike
  • Automation-first mindset: defaults to automating repeatable operational work
  • Communication excellence: persuasive and calibrated to audience - from Teams threads to exec briefings
  • Backlog prioritization: skilled at making and defending priority calls in a resource-constrained environment
  • Technical credibility: engages deeply with engineers while translating complexity for leadership
  • Operational rigor: disciplined about documentation, process, and follow-through
  • Collaborative leadership: builds trust across engineering, product, and operations boundaries
  • Resilience and adaptability: thrives in ambiguity and rapid change
WORKING CONDITIONS:
  • Traditional office environment with low-to-moderate office noise (computers, phones and business conversations). The position may be remote from main offices.
  • Work from home.
  • May require on-call availability during major platform incidents.
  • May require occasional travel.
  • Flexibility in hours may be required given the 24/7 nature of platform operations.
PHYSICAL DEMANDS:
Xactus promotes an equal opportunity workplace, which includes reasonable accommodations of otherwise qualified disabled applicants and team members. Please contact your supervisor with questions regarding the physical demands of this position.
  • Lifting/carrying up to 10 lbs.
  • Manual dexterity for computer work
  • Speaking, hearing and vision are required to perform essential functions.

ADDITIONAL INFORMATION REGARDING EMPLOYMENT WITH XACTUS:

Candidates applying for a position at Xactus must be residents of the United States and must be authorized and eligible for employment in the United States. Xactus does not provide visa sponsorship at this time.

If hired to work at Xactus, employment is contingent upon successful completion of required background checks (including but not limited to federal and state criminal history checks, employment history verification, education verification, credit history check) and pre-employment drug screening.

No agency submissions, please. We do not accept unsolicited resumes from third-party recruiters.

Equal Opportunity Employer
This employer is required to notify all applicants of their rights pursuant to federal employment laws. For further information, please review the Know Your Rights notice from the Department of Labor.
