Incident Management and Escalation Process

Modified on Mon, 24 Nov, 2025 at 4:39 PM

Osher Digital uses a clear and predictable incident management process to ensure any disruption to your AI or automation system is handled quickly and professionally. This article explains how incidents are identified, how to report an issue and how escalation is managed.

How We Detect Incidents

We use proactive monitoring wherever possible. This includes:

Server resource monitoring
Container health checks
LLM provider status observation
API connectivity checks
Log review and alerting

If the system encounters an issue that can be detected programmatically, our team is notified automatically so we can take action without waiting for you to report it.
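As a rough illustration of what "detected programmatically" can mean, the sketch below runs a set of named checks and sends one notification per failure. The check names, `run_checks`, `alert_on_failures` and the `notify` callback are all hypothetical; this is not a description of Osher Digital's actual monitoring tooling.

```python
from typing import Callable, Dict, List


def run_checks(checks: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run each check and return the names of those that failed.

    A check that raises an exception is treated as a failure.
    """
    failed = []
    for name, check in checks.items():
        try:
            healthy = check()
        except Exception:
            healthy = False
        if not healthy:
            failed.append(name)
    return failed


def alert_on_failures(
    checks: Dict[str, Callable[[], bool]],
    notify: Callable[[str], None],
) -> List[str]:
    """Send one notification per failed check (notify is a placeholder)."""
    failed = run_checks(checks)
    for name in failed:
        notify(f"Check failed: {name}")
    return failed
```

In practice each check would wrap something like a container health probe or an API connectivity test, and `notify` would post to whatever alerting channel the team uses.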

When Clients Should Report an Issue

Monitoring covers many scenarios, but not every issue can be detected automatically. For example:

Incorrect responses from an AI assistant
Missing documents in your knowledge base
Problems caused by changes in your internal systems
User access issues
Workflows not triggering as expected
Downtime that our monitoring has not picked up

If you notice anything unusual, you should report it immediately.

How to Report an Incident

Please open a support ticket through the Osher Digital Help Centre as soon as possible. Include:

A clear description of the issue
The impact on your team
Any recent changes to your systems
Screenshots or examples where helpful

Opening a ticket ensures the issue is tracked, prioritised and escalated correctly.
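If your team prepares tickets from an internal tool or template, the details listed above can be captured in a simple structure. This is only a sketch: the `IncidentReport` class and its field names are hypothetical and do not correspond to any Help Centre API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IncidentReport:
    """Details worth including when opening a support ticket."""

    description: str                # clear description of the issue
    impact: str                     # impact on your team
    recent_changes: str = ""        # any recent changes to your systems
    attachments: List[str] = field(default_factory=list)  # screenshots or examples

    def missing_details(self) -> List[str]:
        """Return the names of text fields that are still empty."""
        missing = []
        if not self.description.strip():
            missing.append("description")
        if not self.impact.strip():
            missing.append("impact")
        if not self.recent_changes.strip():
            missing.append("recent_changes")
        return missing
```

A quick call to `missing_details()` before submitting helps confirm that nothing important has been left out. Attachments are optional here because screenshots are requested only where helpful.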

Incident Severity Levels

Incidents are categorised based on impact and urgency.

Priority 1 (P1): Critical Impact

The system is unavailable, major functions are not working or business operations are severely affected.

Examples:

AI assistant or automation system is down
Hosting environment is unreachable
Critical workflow failure affecting production

Actions:

Immediate escalation
Rapid response and resolution
Frequent updates until resolved

Priority 2 (P2): Significant Impact

Important functionality is degraded but the system is still operating.

Examples:

Incorrect AI responses due to missing data
Workflow delays
Partial outage or intermittent behaviour

Actions:

Prompt investigation
Regular updates
Fix planned as soon as possible

Priority 3 (P3): Minor Impact

Non-critical issues or improvements that do not affect core operations.

Examples:

UI glitches
Non-urgent configuration changes
Minor inaccuracies or formatting issues

Actions:

Scheduled into a maintenance window or the sprint forecast
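The three levels can be read as a simple decision order: check for critical impact first, then significant impact, and treat everything else as minor. The sketch below encodes that reading. The `Severity` enum, the `classify` function and its three boolean inputs are illustrative assumptions; real triage relies on human judgement about impact and urgency.

```python
from enum import Enum


class Severity(Enum):
    P1 = "Critical Impact"
    P2 = "Significant Impact"
    P3 = "Minor Impact"


def classify(
    system_down_or_major_failure: bool,
    operations_severely_affected: bool,
    functionality_degraded: bool,
) -> Severity:
    """Map coarse impact flags to a priority level (hypothetical helper)."""
    # P1: system unavailable, major functions not working,
    # or business operations severely affected.
    if system_down_or_major_failure or operations_severely_affected:
        return Severity.P1
    # P2: important functionality degraded, system still operating.
    if functionality_degraded:
        return Severity.P2
    # P3: everything else, e.g. UI glitches or non-urgent changes.
    return Severity.P3
```

For example, an unreachable hosting environment would map to `Severity.P1`, while intermittent workflow delays on a running system would map to `Severity.P2`.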

How Escalation Works

Escalation ensures the right people are engaged quickly at every stage.

Step 1: Ticket Logged

The client reports the issue through the Help Centre.

Step 2: Initial Assessment

We determine severity and business impact.

Step 3: Response Assigned

The appropriate technical team member is allocated based on severity.

Step 4: Escalate if Required

For higher-severity incidents, we escalate internally to senior engineers or leadership.

Step 5: Issue Resolved

Fixes may involve configuration updates, workflow adjustments, file re-ingestion or infrastructure actions.

Step 6: Root Cause Review

Where relevant, we document the cause and preventive steps.
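The six steps above can be pictured as an ordered path in which two steps are conditional: escalation happens only if required, and a root cause review happens only where relevant. The sketch below models that path. The function name `incident_path` and its two flags are hypothetical and are not taken from any internal tooling.

```python
from typing import List


def incident_path(escalate: bool, root_cause_review: bool) -> List[str]:
    """Return the steps an incident passes through, in order."""
    path = ["Ticket Logged", "Initial Assessment", "Response Assigned"]
    if escalate:
        # Step 4 applies only when the incident needs senior engineers or leadership.
        path.append("Escalate if Required")
    path.append("Issue Resolved")
    if root_cause_review:
        # Step 6 applies only where a documented cause and preventive steps are relevant.
        path.append("Root Cause Review")
    return path
```

A minor issue might follow only the four unconditional steps, while a critical outage would typically pass through all six.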

Communication During Incidents

Our communication during an incident is designed to keep you informed without creating noise.

P1 issues receive immediate updates.
P2 issues receive updates once new information is available.
P3 issues progress through scheduled maintenance.

You can request a summary report for any significant incident.

How Clients Can Help Speed Up Resolution

You can accelerate the incident process by providing:

Clear examples or screenshots
Details of any recent system or document changes
Information about how the issue affects workflows
Confirmation of who should be contacted for updates

Providing these details up front helps us solve the issue faster.

Our Commitment

Osher Digital is committed to maintaining secure and reliable systems. Our incident process ensures that issues are addressed quickly, escalated appropriately and resolved with minimal disruption to your operations.

If you experience any issues, please open a ticket immediately so we can start the incident process and restore full service as quickly as possible.