Monitoring & Incidents Analyst

Full time

Job Description

Job Purpose
The Monitoring & Incidents Analyst will be part of the 24/7 Command Center Operations
team at Bank Muscat, responsible for proactively monitoring all IT services and infrastructure
using advanced observability and performance monitoring tools. The role ensures timely incident
detection, escalation, and resolution by coordinating with technical teams and vendors. The
analyst plays a key part in maintaining system uptime, managing critical incidents, and
supporting root cause analysis and incident trend reporting.

Key Tasks and Duties
Command Center Monitoring & Operations:
• Operate within the Command Center as part of a 24×7 rotating shift team.
• Monitor IT infrastructure and services using tools like Dynatrace, SolarWinds, Grafana, Riverbed NPM, etc.
• Ensure all monitoring and observability tools function without disruption.
• Follow pre-defined escalation protocols for anomalies, alerts, or failures.
• Maintain shift logs and handover reports, ensuring accurate communication across shifts.
• Execute runbooks and predefined recovery procedures when needed.
• Contact appropriate support and vendor teams in case of verified incidents.

Incident Management:
• Validate, assess, classify, and manage all IT-related incidents.
• Take ownership of Major Incidents, ensuring timely escalation and communication.
• Lead crisis calls and coordinate war-room activities.
• Provide regular incident updates to stakeholders, management, and end-users.
• Support incident resolution by working closely with L2/L3 teams, SMEs, and vendors.
• Track open issues and ensure adherence to SLA timelines for resolution.
• Ensure proper incident documentation in internal and vendor tracking systems.
• Participate in Root Cause Analysis (RCA) and implement corrective actions.

Reporting & Governance:
• Assist in preparing daily/weekly/monthly incident trend reports.
• Support historical data analysis from monitoring tools for problem and trend identification.
• Participate in post-incident reviews and document lessons learned.
• Track progress of follow-ups and improvement actions resulting from incidents.
• Ensure compliance with audit, risk management, and change management policies.

Enhancements & Projects:
• Support enhancements to monitoring tools and observability systems.
• Collaborate with change/digital transformation teams for technology upgrades.
• Research and recommend modern monitoring strategies or tools.
• Maintain a knowledge library of SOPs, alerts, use cases, and response strategies.

Qualification
Bachelor’s degree in Information Technology or a related field.
