Incident Management and Escalation Process

Modified on Mon, 24 Nov at 4:39 PM

Osher Digital uses a clear and predictable incident management process to ensure any disruption to your AI or automation system is handled quickly and professionally. This article explains how incidents are identified, how to report an issue and how escalation is managed.

How We Detect Incidents

We use proactive monitoring wherever possible. This includes:

  • Server resource monitoring

  • Container health checks

  • LLM provider status observation

  • API connectivity checks

  • Log review and alerting

If the system encounters an issue that can be detected programmatically, our team is notified automatically so we can take action without waiting for you to report it.
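As an illustration only, automatic detection of this kind can be sketched in Python. The `run_checks` and `alert_on_failures` helpers, the check names and the `notify` callback are hypothetical and do not describe Osher Digital's actual tooling. Each check is assumed to be a function that returns True when the component is healthy.

```python
from typing import Callable, Dict, List


def run_checks(checks: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run every check and return the names of those that failed.

    A check that raises an exception is treated as a failure, so one broken
    probe cannot hide problems found by the others.
    """
    failed = []
    for name, check in checks.items():
        try:
            healthy = check()
        except Exception:
            healthy = False
        if not healthy:
            failed.append(name)
    return failed


def alert_on_failures(
    checks: Dict[str, Callable[[], bool]],
    notify: Callable[[str], None],
) -> List[str]:
    """Send one notification per failed check and return the failed names."""
    failed = run_checks(checks)
    for name in failed:
        notify(f"Check failed: {name}")
    return failed
```

In this sketch, a scheduler would call `alert_on_failures` at a regular interval with one entry per monitored item, such as a container health check or an API connectivity check.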

When Clients Should Report an Issue

Monitoring covers many scenarios, but not every issue can be detected automatically. For example:

  • Incorrect responses from an AI assistant

  • Missing documents in your knowledge base

  • Problems caused by changes in your internal systems

  • User access issues

  • Workflows not triggering as expected

  • Downtime that our monitoring has not yet picked up

If you notice anything unusual, you should report it immediately.

How to Report an Incident

Please open a support ticket through the Osher Digital Help Centre as soon as possible. Include:

  • A clear description of the issue

  • The impact on your team

  • Any recent changes to your systems

  • Screenshots or examples where helpful

Opening a ticket ensures the issue is tracked, prioritised and escalated correctly.
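For teams that raise tickets from their own tooling, the details listed above could be captured in a small structure before submission. This is a minimal sketch: the `IncidentReport` class and its field names are hypothetical and are not part of the Help Centre.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IncidentReport:
    """Hypothetical ticket payload mirroring the details requested above."""

    description: str
    impact: str
    recent_changes: str = ""
    # Screenshots or examples are optional, so an empty list is acceptable.
    attachments: List[str] = field(default_factory=list)

    def missing_details(self) -> List[str]:
        """Return the names of text fields that are still empty."""
        missing = []
        if not self.description.strip():
            missing.append("description")
        if not self.impact.strip():
            missing.append("impact")
        if not self.recent_changes.strip():
            missing.append("recent_changes")
        return missing
```

Calling `missing_details()` before submitting gives a quick reminder of anything left blank.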

Incident Severity Levels

Incidents are categorised based on impact and urgency.

Priority 1 (P1): Critical Impact

The system is unavailable, major functions are not working or business operations are severely affected.


Examples:

  • AI assistant or automation system is down

  • Hosting environment is unreachable

  • Critical workflow failure affecting production

Actions:

  • Immediate escalation

  • Rapid response and resolution

  • Frequent updates until resolved

Priority 2 (P2): Significant Impact

Important functionality is degraded but the system is still operating.


Examples:

  • Incorrect AI responses due to missing data

  • Workflow delays

  • Partial outage or intermittent behaviour

Actions:

  • Prompt investigation

  • Regular updates

  • Fix planned as soon as possible

Priority 3 (P3): Minor Impact

Non-critical issues or improvements that do not affect core operations.


Examples:

  • UI glitches

  • Non-urgent configuration changes

  • Minor inaccuracies or formatting issues

Actions:

  • Scheduled into a maintenance window or the sprint forecast
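The three levels above can be read as a simple decision rule. The sketch below is one possible encoding of that rule, not the procedure our team actually follows. The `classify` function and its parameter names are assumptions made for illustration.

```python
from enum import Enum


class Severity(Enum):
    P1 = "Critical Impact"
    P2 = "Significant Impact"
    P3 = "Minor Impact"


def classify(
    *,
    system_down: bool = False,
    major_function_down: bool = False,
    operations_severely_affected: bool = False,
    functionality_degraded: bool = False,
) -> Severity:
    """Map observed impact to a severity level, checking the worst case first."""
    if system_down or major_function_down or operations_severely_affected:
        return Severity.P1
    if functionality_degraded:
        return Severity.P2
    # Anything that does not affect core operations falls through to P3.
    return Severity.P3
```

Checking the most severe conditions first means an incident that matches several descriptions is always assigned the highest applicable level.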

How Escalation Works

Escalation ensures the right people are engaged quickly at every stage.

Step 1: Ticket Logged

The client reports the issue through the Help Centre.

Step 2: Initial Assessment

We determine severity and business impact.

Step 3: Response Assigned

The appropriate technical team member is allocated based on severity.

Step 4: Escalate if Required

For higher severity incidents, we escalate internally to senior engineers or leadership.

Step 5: Issue Resolved

Fixes may involve configuration updates, workflow adjustments, file re-ingestions or infrastructure actions.

Step 6: Root Cause Review

Where relevant, we document the cause and preventive steps.
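The six steps above can be summarised as a short sketch. It assumes that only P1 counts as "higher severity" for Step 4. That threshold, the `incident_steps` function and the `ESCALATE_LEVELS` constant are illustrative assumptions, not a statement of our internal rules.

```python
from typing import FrozenSet, List

# Assumption: only P1 incidents trigger internal escalation in this sketch.
ESCALATE_LEVELS: FrozenSet[str] = frozenset({"P1"})


def incident_steps(
    severity: str,
    needs_root_cause_review: bool,
    escalate_levels: FrozenSet[str] = ESCALATE_LEVELS,
) -> List[str]:
    """Return the ordered steps an incident of the given severity passes through."""
    if severity not in {"P1", "P2", "P3"}:
        raise ValueError(f"Unknown severity: {severity}")
    steps = ["Ticket Logged", "Initial Assessment", "Response Assigned"]
    if severity in escalate_levels:
        steps.append("Escalated to Senior Engineers or Leadership")
    steps.append("Issue Resolved")
    if needs_root_cause_review:
        steps.append("Root Cause Review")
    return steps
```

For example, `incident_steps("P3", False)` skips both the escalation and the review step, while `incident_steps("P1", True)` includes all six.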

Communication During Incidents

Our communication during an incident is designed to keep you informed without creating noise.

  • P1 issues receive immediate updates.

  • P2 issues receive updates as new information becomes available.

  • P3 issues progress through scheduled maintenance.

You can request a summary report for any significant incident.

How Clients Can Help Speed Up Resolution

You can accelerate the incident process by providing:

  • Clear examples or screenshots

  • Details of any recent system or document changes

  • Information about how the issue affects workflows

  • Confirmation of who should be contacted for updates

Providing these details up front helps us solve the issue faster.

Our Commitment

Osher Digital is committed to maintaining secure and reliable systems. Our incident process ensures that issues are addressed quickly, escalated appropriately and resolved with minimal disruption to your operations.


If you experience any issues, please open a ticket immediately so we can start the incident process and restore full service as quickly as possible.
