
In today's interconnected digital landscape, the stability and continuous operation of IT infrastructure are paramount for business success. Organizations increasingly rely on a complex ecosystem of critical supporting technologies to maintain operations, secure data, and serve customers. This article outlines the steps organizations need to take to ensure consistent delivery of services for a SASE architecture.

 

Why do organizations need a Resiliency Plan for their Netskope deployment?

While Netskope maintains a robust, highly available data plane network that allows customers to connect seamlessly and securely from anywhere, some Netskope services rely on the Management Plane (MP), which is where the primary tenant is deployed. In the unlikely event (such as a natural disaster) that a Management Plane loses the ability to provide service to customers, and the Netskope team is unable to restart the services within the original MP, customers who wish to maintain the availability of Netskope services will need to take action. Such customers may wish to develop a contingency plan to support failover to a secondary MP location.

 

What steps should Customers take to prepare for an MP failover?

Customers looking to create and maintain an MP failover plan will need to do the following:

  • Determine a Secondary MP location, geographically dispersed from your primary MP
  • Provision users and groups through your identity provider (IdP)
  • Start with everything in the Settings section of the tenant
    • Under Administration
      • Duplicate all of your Administrators and Roles
        • If using V2 APIs, ensure new APIs are created with the same names and permissions as the established APIs, so that updating any API calls is as simple as changing the URL
          • It is recommended to store all of these credentials in a vault if you aren’t doing so already
      • Labels
      • SSO
      • IP Allowlist
      • Privacy Notice
      • Internal Domains
      • Tracking, Audit Log, and CASB API Usage are not configurable
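
The note above about keeping API names and permissions identical implies that automations should only need a different base URL after a failover. A minimal sketch of that idea in Python follows. The tenant hostnames and the endpoint path are hypothetical placeholders, not values taken from Netskope documentation, and the sketch assumes the token itself is fetched from your vault at call time rather than stored in code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tenant:
    """One tenant's API location. Hostnames used below are placeholders."""
    name: str
    base_url: str


# Hypothetical hostnames; substitute the real primary and failover tenant URLs.
PRIMARY = Tenant("primary", "https://primary-tenant.example.com")
FAILOVER = Tenant("failover", "https://failover-tenant.example.com")


def build_url(tenant: Tenant, path: str) -> str:
    """Join the tenant base URL and an API path with exactly one slash."""
    return tenant.base_url.rstrip("/") + "/" + path.lstrip("/")


def active_tenant(failed_over: bool) -> Tenant:
    """Pick the tenant that automations should call."""
    return FAILOVER if failed_over else PRIMARY


# Illustrative path only (an assumption); it is identical on both tenants by design.
EXAMPLE_PATH = "/api/v2/example/resource"

print(build_url(active_tenant(failed_over=False), EXAMPLE_PATH))
print(build_url(active_tenant(failed_over=True), EXAMPLE_PATH))
```

Keeping the tenant choice in one place means a failover becomes a single configuration change instead of an edit to every script.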

 

  • Under Security Cloud Platform duplicate the settings in “Configuration”
  • Depending on what products you utilize from the Netskope platform you will have more options available that you will need to replicate. 
    • Under TRAFFIC STEERING the options you will need to look at include:
      • Steering Configuration
      • App Definition
        • CLOUD & FIREWALL APPS
        • PRIVATE APPS
        • CERTIFICATE PINNED APPS
      • Publishers (Spin up publishers or create a plan to re-register existing publishers for NPA connectivity if using NPA)
        • At least one ‘live’ Publisher should be deployed and registered to the tenant in order to build out Real-time access policies
      • IPSec Site
      • GRE Site
      • Explicit Proxy
    • Under NETSKOPE CLIENT they are:
      • Client Configuration
      • Enforcement
      • SAML
      • MDM Distribution
      • Other items listed here are Users, Groups, and Devices. These will be populated automatically through the Directory/SCIM sync and when devices install the new client and sync up to the failover tenant
    • Under ENTERPRISE BROWSER:
      • Browser Setup
      • User Provisioning
        • Company Workspaces
        • Email Templates
    • Under REVERSE PROXY:
      • SAML
      • Office 365 Auth
      • Auth Integration
    • Under FORWARD PROXY:
      • SAML
    • ON PREMISES
      • On-Premises infrastructure
      • CDPP for Appliance
    • Under MAIL RELAY
      • SMTP
  • Under Risk Insights
    • Parsers - copy any Custom Parsers you have created.
  • Under Configure App Access:
    • Next Gen
      • Replicate any CASB API connections as well as SECURITY POSTURE connections
    • Classic
      • Replicate any SaaS and/or IaaS connections
  • Under Threat Protection
    • API Enabled Protection
      • Set up your Quarantine and Remediation profiles
    • Integration
      • Set up any Integrations you have already established. You may not be using this feature, as this functionality is being moved to Cloud Threat Exchange
    • IPS Settings
      • ALLOW LIST
      • SIGNATURE OVERRIDES
    • Malware Retention
      • Configuration
      • Instances
  • Under Forensics
    • CONFIGURATION
    • INSTANCES
  • Under Manage
    • Device Classification
    • Advanced File Scanning
    • Multi-Factor Authentication Integration
    • Sensitivity Label Integration
    • Forward to Proxy Integration
    • Header Insertion
    • Certificates
      • TRUSTED CA
      • SIGNING CA
      • PRIVATE APP CERT
  • Under Tools
    • Templates
    • Directory Tools
      • On-Prem Integration
      • SCIM Integration
    • REST API v1 (If it is still in use)
    • REST API v2 (If you have not migrated them all to Administrators & Roles)
    • Event Streaming
    • Log Streaming
  • Once all of the ‘backend’ configurations are done, you are ready to build out the actionable items, such as the exceptions and SSL bypasses that you have established in your primary MP
  • Unlike the ‘top down’ approach we took for the Settings page, we are going to jump right into Policies > Profiles now.
    • Under DLP, copy all custom:
      • DLP RULES
      • FINGERPRINT RULES
      • FILE CLASSIFIERS
    • Still under DLP, now that the items listed above exist, recreate the custom DLP Profiles
    • Under DNS
      • Copy any DNS Profiles
    • Under Threat Protection
      • MALWARE DETECTION PROFILES
      • REMEDIATION PROFILES
    • Skip over Custom Categories and start on URL Lists
      • Copy all URL Lists from the main tenant
    • Now go back to Custom Categories and recreate those from the main tenant
    • Under App Instance:
      • Create all App Instances that you reference in Policies
        • Other app instances will be dynamically created as the tenant sees traffic going to and from them
    • Under HTTP Header:
      • Copy any custom headers that have been created
    • Under Connected App/Plugin
      • Recreate the CONNECTED APP PLUGIN PROFILE
    • Under Domain, User, and File
      • Recreate all Profiles for each section
    • Under Constraint you will need to go through each section and recreate the Constraint Profiles
      • USERS
      • STORAGE
      • USB STORAGE DEVICE
      • PRINTER
      • NETWORK FILE SHARE
    • Under Quarantine Profile
      • Recreate all Classic profiles (if still in use)
      • Recreate all Next Gen profiles
        • Quarantine Profiles
        • Custom Tombstone Files
    • Under Legal Hold
      • Recreate all Classic profiles (if still in use)
      • Recreate all Next Gen profiles
    • Under Forensics Profile
      • Recreate all profiles
    • Under Network Location
      • Recreate all custom Network Locations
  • Now let’s go ahead and get the Policies > Templates section addressed
    • Upload all Custom Images before starting on the Notifications
    • Recreate all custom User Notifications
    • Recreate all custom Email Notifications
    • Recreate all RBI Templates
  • Let’s skip out to the App Catalog
    • Ensure the application tagging is matched to the primary tenant if you are leveraging policies and reporting based on the tagging
  • And now that we have all of those items defined, we can move on to the Policies themselves
    • SSL Decryption
    • Endpoint Protection
      • DEVICE CONTROL POLICIES - Important: Maintain the same order of the policies in both the primary and failover tenants
      • CONTENT CONTROL POLICIES - Important: Maintain the same order of the policies in both the primary and failover tenants
    • Real-time Protection
      • Recreate your Policy Groups
      • Replicate all policies in the same order as in the primary tenant
    • Enterprise Browser Protection
      • Browser Protection - Important: Maintain the same order of the policies in both the primary and failover tenants
      • Extension Governance
    • API Data Protection
      • SAAS
        • Classic
        • Next Gen
      • IAAS
    • IaaS Security Posture Management
      • PROFILES & RULES
        • Ensure all custom rules are recreated in the backup tenant before creating any Policies. This ensures that the policies are able to call all appropriate rules and avoids a mismatch on scans between the primary and failover tenants
      • POLICIES
    • SaaS Security Posture Management
      • Rules
        • Ensure all custom rules are recreated in the backup tenant before creating any Policies. This ensures that the policies are able to call all appropriate rules and avoids a mismatch on scans between the primary and failover tenants
      • Policies
    • Behavior Analytics
      • Enable or disable policies to match the primary tenant
      • Recreate any custom rules from the primary tenant
  • Now that we have these policies in place, let’s move on to reporting
    • Advanced Analytics
      • Instruct any Advanced Analytics users who have dashboards in their Personal folder to copy over anything they need so that it is available to them in the failover tenant
      • Copy over any dashboards in the Group folder
      • Replicate any Scheduled reporting from the primary tenant
      • The Netskope Library will be populated with the same dashboards in both tenants, so no action is required here
  • Digital Experience Management
    • Alerts - Recreate any Alerts that have been created in the primary tenant
    • Settings
      • Monitoring Sources
        • General Settings
          • Monitored Users and Stations - Match the settings from the primary tenant
          • Proactively Monitored Applications - Select the applications that require monitoring to match the primary tenant
      • Monitored Users and Stations
    • Topologies
      • Sites - Recreate sites from the primary tenant
      • Gateways - Recreate Gateways from the primary tenant
      • Both of these tasks can be accomplished by Exporting the Gateways and Sites from the primary tenant and Importing them into the failover tenant
    • Monitoring Policies
      • Network Probes
        • Netskope Client - Edit the Network Probe Default profile to match the primary tenant
        • Station - Recreate the Station(s) from the primary tenant
      • App Probes
        • Netskope Client - Recreate the App Probes from the primary tenant
        • Station - Recreate the App Probes from the primary tenant
      • Custom Applications
        • Netskope Client - Recreate the Custom Applications from the primary tenant
        • Stations - Recreate the Custom Applications from the primary tenant
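
The checklist above calls out several ordering constraints: DLP rules before DLP Profiles, URL Lists before Custom Categories, Custom Images before Notifications, rules before posture policies, and a live Publisher plus all profiles before Real-time policies. One way to keep that order explicit is to record the dependencies and let a topological sort produce a build order. The sketch below uses Python's standard `graphlib` module (Python 3.9+). The item names are shortened labels chosen for illustration, and the dependency map is an assumption that you would extend for your own tenant.

```python
from graphlib import TopologicalSorter

# Each key lists the items that must exist before it can be recreated.
# These edges are read off the checklist; the labels are illustrative.
DEPENDS_ON = {
    "DLP Profiles": {"DLP Rules", "Fingerprint Rules", "File Classifiers"},
    "Custom Categories": {"URL Lists"},
    "User Notifications": {"Custom Images"},
    "Real-time Protection Policies": {
        "DLP Profiles",
        "Custom Categories",
        "User Notifications",
        "Publishers",
    },
    "Security Posture Policies": {"Security Posture Rules"},
}


def build_order(depends_on):
    """Return one valid recreation order; raises CycleError on a cycle."""
    return list(TopologicalSorter(depends_on).static_order())


order = build_order(DEPENDS_ON)
for step, item in enumerate(order, start=1):
    print(f"{step:2d}. {item}")
```

Any order the sorter returns places prerequisites first, so it can serve as a worksheet for whoever is building the failover tenant.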

 

  • Determine how your organization will maintain tenant symmetry between your primary and failover tenant
    • Change Control procedures can be modified in order to ensure policy replication takes place in the failover tenant as they are rolled out in the primary tenant
  • Establish a maximum acceptable outage time for services provided by Netskope that would trigger the decision to failover to the backup tenant
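
Tenant symmetry is easier to maintain when drift is checked as part of change control. The following is a minimal sketch, assuming you keep an ordered list of object names per section for each tenant (for example, recorded by hand or exported during each change). The section names, policy names, and the inventory format are all hypothetical.

```python
def compare_tenants(primary, failover):
    """Compare ordered name lists per section.

    Returns a dict mapping each section that differs to a dict with
    'missing' (only in primary), 'extra' (only in failover) and
    'order_differs' (same names, different order).
    """
    report = {}
    for section in sorted(set(primary) | set(failover)):
        p_items = primary.get(section, [])
        f_items = failover.get(section, [])
        missing = [name for name in p_items if name not in f_items]
        extra = [name for name in f_items if name not in p_items]
        order_differs = not missing and not extra and p_items != f_items
        if missing or extra or order_differs:
            report[section] = {
                "missing": missing,
                "extra": extra,
                "order_differs": order_differs,
            }
    return report


# Hypothetical inventories recorded from each tenant during change control.
primary = {
    "Real-time Protection": ["Block malware", "Allow sanctioned apps", "DLP upload"],
    "URL Lists": ["Partner sites", "Blocked sites"],
}
failover = {
    "Real-time Protection": ["Allow sanctioned apps", "Block malware", "DLP upload"],
    "URL Lists": ["Partner sites"],
}

for section, diff in compare_tenants(primary, failover).items():
    print(section, diff)
```

The order check matters because several policy sections in the checklist must keep the same policy order in both tenants.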

 

What steps do Customers need to take to use their Warm MP during an MP failure?

During an MP failure event customers should plan to take the following actions:

  • Redeploy and enroll new Netskope Clients that connect users to the secondary tenant
  • Ensure all the policies created before failover are activated
  • Alert all users of the disruption
  • Validate the failover by ensuring automations are functional and logs are being ingested in the correct manner from the failover tenant
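
The last validation step can be partly automated by checking that recent events are arriving from the failover tenant. The sketch below assumes your log pipeline can supply the timestamps of events it has received. The 15-minute freshness window and the sample timestamps are illustrative assumptions, not Netskope recommendations.

```python
from datetime import datetime, timedelta, timezone


def logs_are_fresh(event_times, now, max_age=timedelta(minutes=15)):
    """True if the newest event is no older than max_age relative to now."""
    if not event_times:
        return False
    return now - max(event_times) <= max_age


# Hypothetical timestamps of events received from the failover tenant.
now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
received = [
    now - timedelta(minutes=42),
    now - timedelta(minutes=3),
]

print("ingestion OK" if logs_are_fresh(received, now) else "ingestion stale")
```

An empty list is treated as a failure, so a pipeline that is still pointed at the primary tenant is flagged rather than silently passing.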

 

While Netskope maintains the highest standard for our Management Planes and high-availability NewEdge, we recognize the need for our customers and partners to maintain a holistic and comprehensive business continuity plan. Using this short guide, organizations can plan for and restore Netskope services in a timely and efficient manner in the event of a Management Plane failure.


Further information regarding Netskope’s SLAs can be found on our Support terms page. For further questions, please reach out to GRC@netskope.com.

Am I correct in assuming that this failover MP is a different tenant altogether?

If so, is this extra tenant something that Netskope provides for customers without extra cost?

 

To be blunt, this really just seems like “oh, if we have an issue with our management plane, it’s expected that our customers are ready to re-create their entire tenant from scratch”. 
I cannot see how that is a realistic expectation compared to competing products.
 


Hello Ryan,

Yes, the failover tenant is in a different Management Plane altogether in order to geographically separate the failover tenant from the primary tenant. This article was produced at the request of multiple customers wanting to explore options for increased resiliency in case a catastrophe, like a natural disaster, brings down a Management Plane for an extended period of time. With recent service provider issues like the CrowdStrike incident in July 2024, and the more recent AWS outage, we are providing guidance on how companies can add a layer to their BCP/DR plans. As referenced in the article, Netskope has an expected SLA uptime of >= 99.999% for inline services.

 


Here are the instructions in checklist form.