Provides a fully audited process in which all incident data is captured in the ITSM system and operators have insight into the appropriate remediation action.
NETWORK DEVICE AUTO-REMEDIATION
Powered by StackStorm
Situation: A Network Interface Has Failed
- A network interface has gone down, triggering actions to discover why the interface failed.
- Was this a manual or intentional shutdown, the result of testing underway, or a critical failure of a network device?
- When the root cause of the network outage is unknown, IT staff will prioritize the problem as urgent and take actions accordingly.
- IT operations personnel are broadly activated to help remediate the situation even if the problem ultimately proves to be outside of their areas of responsibility.
The Conventional Workflow Approach
Manual Process: 2-4 Stressful Hours for Network Operations (see Figure 1 below).
A user recognizes that a service has gone down and creates a ServiceNow ticket. Service Ops assigns the ticket to NetOps with a Priority 1.
NetOps team members now manually start diagnostics, discovering that an interface is down. Remediation action is performed to bring up the interface.
If the remediation succeeds, the NetOps team updates the ticket. If it does not, the team continues running diagnostics to discover why the interface went down, and the ticket remains a Priority 1 task. The ServiceNow record is then updated and closed.
Figure 1 – Manual Interface Outage Response
Orchestral.ai's Composer Solution
Orchestral.ai provides a completely automated solution for this problem. Orchestral Composer automatically executes new or existing remediation workflows that IT operations teams have developed, while leveraging the infrastructure tooling and management platforms that are already deployed.
Maestro + Composer: 20-40 Stress-Free Seconds (see Figure 2 below).
Figure 2 – Maestro + Composer Automated Event Driven Network Remediation
Composer Event-Driven Auto-Remediation
- Composer informs the operations teams about the interface outage through chatops or a similar alerting/communications tool and indicates the start of an auto-remediation workflow.
- Composer collects state information on the router before and after the auto-remediation action.
- Composer zips the two "pre" and "post" state files as artifacts of the incident.
- If the interface has been successfully restored: Composer then opens a service ticket with priority "Low" on the ITSM ticketing system and attaches the troubleshooting artifacts for further analysis.
- If the interface has not been successfully restored: Composer opens a service ticket with priority "High" on the ITSM ticketing system and attaches the troubleshooting artifacts for further analysis.
- Lastly, Composer informs the operations team through chatops of the new incident created and the ticket number for follow-up action.
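The steps above can be sketched in Python as follows. This is an illustrative outline only, not Composer's or StackStorm's actual implementation: the `Ticket` class, the `auto_remediate` and `zip_states` functions, the archive file names, and every callable passed in (`notify`, `collect_state`, `bring_up`, `is_up`, `open_ticket`) are hypothetical stand-ins for whatever chatops, device, and ITSM integrations are really in place.

```python
import io
import zipfile
from dataclasses import dataclass
from typing import Callable


@dataclass
class Ticket:
    """Hypothetical stand-in for an ITSM incident record."""
    number: str
    priority: str


def zip_states(pre: str, post: str) -> bytes:
    """Bundle the "pre" and "post" state captures into one in-memory zip archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("pre_state.txt", pre)    # file names are assumptions
        zf.writestr("post_state.txt", post)
    return buf.getvalue()


def auto_remediate(
    interface: str,
    notify: Callable[[str], None],                      # hypothetical chatops hook
    collect_state: Callable[[], str],                   # hypothetical router state capture
    bring_up: Callable[[str], None],                    # hypothetical remediation action
    is_up: Callable[[str], bool],                       # hypothetical interface check
    open_ticket: Callable[[str, str, bytes], Ticket],   # hypothetical ITSM call
) -> Ticket:
    # 1. Announce the outage and the start of the workflow.
    notify(f"Interface {interface} is down; starting auto-remediation.")

    # 2. Capture router state before and after the remediation action.
    pre = collect_state()
    bring_up(interface)
    post = collect_state()

    # 3. Package both captures as incident artifacts.
    artifacts = zip_states(pre, post)

    # 4. Priority depends on whether the interface was restored.
    priority = "Low" if is_up(interface) else "High"
    ticket = open_ticket(f"Interface {interface} outage", priority, artifacts)

    # 5. Report the new incident and its ticket number.
    notify(f"Incident {ticket.number} created with priority {ticket.priority}.")
    return ticket
```

Passing the integrations in as callables keeps the sketch independent of any particular chat tool, device driver, or ticketing API, so each one can be swapped or stubbed out during testing.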
Full incident auditing with automated IT ticket generation
Reduction of network downtime
Network downtime is greatly reduced because the auto-remediation workflow executes in seconds, rather than merely raising an alert and extending the downtime while operators are mobilized to intervene manually.
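As a rough illustration, the response times quoted in this document (2-4 hours for the manual process, 20-40 seconds for the automated one) can be compared directly. The short Python calculation below uses only those quoted figures; it compares stated response times, not measured downtime, and the variable names are illustrative.

```python
# Durations quoted in this document, converted to seconds.
manual_seconds = (2 * 3600, 4 * 3600)   # 2-4 hours
automated_seconds = (20, 40)            # 20-40 seconds

# Most conservative ratio: shortest manual time vs. longest automated time.
min_speedup = manual_seconds[0] / automated_seconds[1]
# Most favorable ratio: longest manual time vs. shortest automated time.
max_speedup = manual_seconds[1] / automated_seconds[0]

print(f"{min_speedup:.0f}x to {max_speedup:.0f}x faster")
```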
Automatic prioritization of critical incidents
If the auto-remediation workflow is unsuccessful, the IT ticket can be automatically given a higher priority, which triggers the right response from the appropriate operations personnel.
Orchestral's solutions are available as free 30-day Proof of Value evaluations. To get started, just click the "FREE TRIAL" button at the top of this page and complete the Trial Request Form. If you'd like to see a demo first, just click the "Book a Demo" button below to book a date/time that works best for you. Otherwise, you can get started by emailing us at firstname.lastname@example.org.
Ready to see for yourself?
We'd love to show you how Orchestral.ai enables you to address a broad spectrum of orchestration & automation challenges.
Book a Demo