One of the most common questions I get from administrators is how Netskope determines the optimal data center (DC) when steering inline traffic. I’d like to simply tell you that the client and other Netskope services are intelligent enough to steer traffic to the optimal DC. While that’s true on the surface, there is some nuance to it. Before we dive in, let’s take a high-level look at Netskope’s architecture.
NewEdge is Netskope’s underlying network that powers all of our inline services. There are NewEdge DCs across the globe in more than 60 regions, in cities such as London, Miami, San Francisco, Hong Kong, Delhi, Dubai, and numerous others. You can look at our publicly available Trust Portal to see the most up-to-date list of active DCs. By default, the client and other Netskope services can utilize any of these DCs for services such as SWG, DLP, content filtering, inline CASB, and more. While this is the default tenant configuration, administrators can also request more specific traffic processing zones for highly regulated environments or countries with data processing requirements (“data-in-motion” compliance). For example, if you require that data processing occurs in the European Union, we can limit your tenant to only route traffic through the European Union DCs. In general, though, we want our users to connect to the closest data plane for optimal performance regardless of what zone they are in, while preferring in-region/country DCs.
This article focuses on the Netskope steering client, but other Netskope services such as Cloud Explicit Proxy and the SMTP proxy also have specific mechanisms to provide optimal DC selection. Keep in mind that you can use the log entries from the examples below to troubleshoot data plane selection.
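The client has three methods for selecting the best location. In order of preference, they are NewEdge Traffic Management 2.0 (the GSLB service), NewEdge Traffic Management 1.0 (DNS over HTTPS with EDNS), and regular local DNS (LDNS). Each is covered below.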
NewEdge Traffic Management 2.0
NewEdge Traffic Management 2.0 is the newer, more robust DC selection method that was announced in release R96. As of September 2022, it is in limited availability and can be enabled on select development or lab tenants. NewEdge Traffic Management 2.0 improves on 1.0 by evaluating real-time client-to-DC latency (round trip time or RTT), understanding the actual client location, and providing an extensible method for even more dynamic steering decisions in the future. By examining client logs, we can see what the client and the NewEdge Traffic Management service are actually doing. It begins with the client making a REST API call to retrieve available DCs:
stAgentSvc paa4 tcec info config.cpp:296 Config Initialize gslb for endpoint:gateway.gslb.goskope.com
stAgentSvc paa4 tcec info config.cpp:307 Config fetching POPs from gslb
stAgentSvc paa4 tcec info restapi.cpp:74 restapi Downloading gslb pop info
stAgentSvc paa4 tcec info restapi.cpp:86 restapi gslb pop info downloaded successfully
The API call returns a list of the 10 closest data planes to the client, based on a geolocation database lookup of the client’s source IP address. The client then evaluates the round trip time to each DC (sample of three below):
stAgentSvc paa4 tcec info GatewaySelection.cpp:213 gslb post client rtt pop:US-SEA1 ip:22.214.171.124 rtt:9
stAgentSvc paa4 tcec info GatewaySelection.cpp:213 gslb post client rtt pop:US-LAX1 ip:126.96.36.199 rtt:30
stAgentSvc paa4 tcec info GatewaySelection.cpp:213 gslb post client rtt pop:US-PHX1 ip:188.8.131.52 rtt:38
In the case above, the client is located in Seattle, and you can see the latency increase for more distant data planes such as Dallas and Atlanta:
stAgentSvc paa4 tcec info GatewaySelection.cpp:213 gslb post client rtt pop:US-DFW1 ip:184.108.40.206 rtt:62
stAgentSvc paa4 tcec info GatewaySelection.cpp:213 gslb post client rtt pop:US-ATL1 ip:220.127.116.11 rtt:82
Once the evaluation is complete, the client informs the Netskope GSLB service of its RTT results, the service responds with the optimal DC (Seattle in this case), and the client establishes a tunnel to it. Generally, this is the DC with the lowest RTT, although the service may occasionally pick a different DC depending on network and other conditions. See the following log message as an example:
stAgentSvc paa4 tcec info tunnel.cpp:902 nsTunnel DTLS Connecting to gateway-sea1.goskope.com:443
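When troubleshooting, it can help to pull the RTT measurements out of a client log and rank them yourself. The following is a minimal Python sketch, assuming the log lines follow the "gslb post client rtt" format shown above. The function and variable names are hypothetical, the IP addresses are redacted placeholders, and the ranking only shows which DC had the lowest measured RTT; the actual selection is made by the GSLB service and may differ.

```python
import re
from typing import NamedTuple

# Pattern based on the "gslb post client rtt" log entries shown in this article.
RTT_LINE = re.compile(
    r"gslb post client rtt pop:(?P<pop>\S+) ip:(?P<ip>\S+) rtt:(?P<rtt>\d+)"
)


class PopRtt(NamedTuple):
    pop: str
    ip: str
    rtt: int


def parse_rtts(log_lines):
    """Extract per-DC RTT measurements and return them sorted by ascending RTT."""
    results = []
    for line in log_lines:
        match = RTT_LINE.search(line)
        if match:
            results.append(PopRtt(match["pop"], match["ip"], int(match["rtt"])))
    return sorted(results, key=lambda entry: entry.rtt)


# Sample lines modeled on the log excerpts above (IPs redacted as X.X.X.X).
SAMPLE_LOG = [
    "stAgentSvc paa4 tcec info GatewaySelection.cpp:213 gslb post client rtt pop:US-DFW1 ip:X.X.X.X rtt:62",
    "stAgentSvc paa4 tcec info GatewaySelection.cpp:213 gslb post client rtt pop:US-SEA1 ip:X.X.X.X rtt:9",
    "stAgentSvc paa4 tcec info GatewaySelection.cpp:213 gslb post client rtt pop:US-ATL1 ip:X.X.X.X rtt:82",
    "stAgentSvc paa4 tcec info GatewaySelection.cpp:213 gslb post client rtt pop:US-LAX1 ip:X.X.X.X rtt:30",
    "stAgentSvc paa4 tcec info GatewaySelection.cpp:213 gslb post client rtt pop:US-PHX1 ip:X.X.X.X rtt:38",
]

ranked = parse_rtts(SAMPLE_LOG)
for entry in ranked:
    print(f"{entry.pop:8} rtt={entry.rtt}")
```

If the DC in the "Connecting to" log entry is not the first one in this ranking, that is not necessarily a problem, since the service can weigh other conditions, but it is a useful starting point for a support conversation.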
One other note, without going too deep into the weeds: the NewEdge Traffic Management 2.0 service also has some logic to prefer in-country DCs for content localization, compliance, and performance purposes. Additionally, Netskope NewEdge is a dynamic network, so data planes under maintenance or in a degraded state are removed from the available list as needed. This happens transparently to the end user.
NewEdge Traffic Management 1.0
The Netskope client has long used DNS over HTTPS as the preferred method for data plane selection. The major benefit of DNS over HTTPS is that it supports Extension Mechanisms for DNS (EDNS), specifically the EDNS Client Subnet (ECS) option, to provide additional client info in the DNS request between the last-hop DNS resolver and NewEdge Traffic Management. This includes the /24 subnet of the client’s egress IP address, which allows for much more accurate geolocation. However, this method still relies on geolocation as the primary means of DC selection. As we migrate to NewEdge Traffic Management 2.0, it’s important to understand how EDNS works, as it may still come up in your day-to-day administration of Netskope. For tenants using EDNS, the client begins by attempting a DNS over HTTPS call:
stAgentSvc p69e0 t2a4c info restapi.cpp:72 restapi Downloading SSL resolve EDNS
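To make the /24 client subnet idea concrete, here is a small Python sketch that truncates an egress IP address to its /24 network and builds a DNS over HTTPS query URL carrying that subnet. It is an illustration only: the helper names are hypothetical, the egress IP is a documentation address, and the URL targets Google's public DNS JSON API (which accepts an edns_client_subnet parameter) as a stand-in, not the endpoint the Netskope client actually calls. No network request is made.

```python
import ipaddress
from urllib.parse import urlencode


def ecs_subnet(egress_ip: str, prefix_len: int = 24) -> str:
    """Truncate an IPv4 address to the subnet that would be sent as the client subnet."""
    # strict=False lets us pass a host address and get its enclosing network back.
    network = ipaddress.ip_network(f"{egress_ip}/{prefix_len}", strict=False)
    return str(network)


def doh_query_url(hostname: str, egress_ip: str,
                  resolver: str = "https://dns.google/resolve") -> str:
    """Build a JSON DoH query URL that includes the client subnet (public resolver assumed)."""
    params = {
        "name": hostname,
        "type": "A",
        "edns_client_subnet": ecs_subnet(egress_ip),
    }
    return f"{resolver}?{urlencode(params)}"


# 203.0.113.57 is a reserved documentation address used as a hypothetical egress IP.
subnet = ecs_subnet("203.0.113.57")
url = doh_query_url("gateway-tenantname.goskope.com", "203.0.113.57")
print(subnet)
print(url)
```

The point of the sketch is that only the network portion of the address is shared, which is enough for geolocation without exposing the full client IP.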
If the DNS over HTTPS call succeeds, the client will reflect that the gateway domain was successfully resolved via EDNS:
stAgentSvc p69e0 t2a4c info nsDnsResolver.cpp:203 dnsResolver Hostname gateway-tenantname.goskope.com resolved by EDNS
stAgentSvc p69e0 t2fd0 info nsssl.cpp:1263 nsssl DTLS remote host gateway-tenantname.goskope.com resolved to X.X.X.X, port 443
The client will then make a connection to the provided data plane via TLS or DTLS depending on your configuration:
stAgentSvc p69e0 t2fd0 info tunnel.cpp:837 nsTunnel DTLS SSL connected to the server: gateway-tenantname.goskope.com:443 successfully
Local DNS
The last and least preferred method for DC selection is regular (local) DNS. It is only used when either of the previous methods fails. The major drawback to local DNS is that when Netskope’s DNS responds to a query for the tenant gateway domain, it bases the response solely on the egress IP address that the DNS request came from. If your user is at home and going directly to the internet, this isn’t a big deal, but most enterprises use centralized DNS over a remote access VPN or site-to-site tunnels. So, imagine a scenario where your enterprise DNS servers are in Miami, Florida, but you’re working remotely in Seattle. Because the DNS request comes from your central DNS servers, a Seattle user could end up being directed to the Miami DC. This is rare on non-enterprise networks, but you should still be aware of it, as suboptimal DC selection can cause performance issues. In the client logs, you will see an entry that notes the failure of the preferred method:
NewEdge Traffic Management 1.0:
stAgentSvc p7aac t50f0 error restapi.cpp:81 restapi Failed to download SSL resolve EDNS, Error: -5
NewEdge Traffic Management 2.0:
stAgentSvc p46d8 t6528 info GatewaySelection.cpp:325 gslb gslb apis failed
The preferred method (DNS over HTTPS or GSLB) failed, so the client makes a DNS request for gateway-<tenantname>.goskope.com using local DNS (LDNS):
stAgentSvc p46d8 t6528 info tunnel.cpp:208 nsTunnel use LDNS resolver type
stAgentSvc p46d8 t6528 info tunnel.cpp:905 nsTunnel DTLS Connecting to gateway-tenantname.goskope.com:443
stAgentSvc p46d8 t6528 info nsDnsResolver.cpp:47 dnsResolver Hostname gateway-tenantname.goskope.com resolved by LDNS
Once resolved, the client then establishes its tunnel to the selected DC:
stAgentSvc p46d8 t6528 info nsssl.cpp:1276 nsssl DTLS remote host gateway-tenantname.goskope.com resolved to X.X.X.X
The good news here is that our client ultimately connected and our security services apply. However, we now have added latency because the client is connecting to a suboptimal DC, which is why this is the least preferred DC selection method.
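To check quickly which selection method a client ended up using, you can scan its log for the messages shown throughout this article. The Python sketch below is a rough troubleshooting aid under stated assumptions: the marker strings are copied from the log excerpts above, the function name and the mapping from marker to method are my own, and real client logs may contain other messages or change between client versions.

```python
def summarize_selection(log_lines):
    """Return (method, failures) inferred from marker strings in client log lines.

    method is 'GSLB', 'EDNS', 'LDNS', or 'unknown'; the last matching marker wins,
    so a log covering several connection attempts reports the most recent one.
    failures lists any preferred-method failures seen along the way.
    """
    method = "unknown"
    failures = []
    for line in log_lines:
        if "gslb apis failed" in line:
            failures.append("GSLB API call failed")
        elif "Failed to download SSL resolve EDNS" in line:
            failures.append("DNS over HTTPS (EDNS) lookup failed")
        elif "resolved by LDNS" in line:
            method = "LDNS"
        elif "resolved by EDNS" in line:
            method = "EDNS"
        elif "gslb post client rtt" in line:
            method = "GSLB"
    return method, failures


# Sample lines taken from the local DNS fallback example above.
SAMPLE_LOG = [
    "stAgentSvc p46d8 t6528 info GatewaySelection.cpp:325 gslb gslb apis failed",
    "stAgentSvc p46d8 t6528 info tunnel.cpp:208 nsTunnel use LDNS resolver type",
    "stAgentSvc p46d8 t6528 info tunnel.cpp:905 nsTunnel DTLS Connecting to gateway-tenantname.goskope.com:443",
    "stAgentSvc p46d8 t6528 info nsDnsResolver.cpp:47 dnsResolver Hostname gateway-tenantname.goskope.com resolved by LDNS",
]

method, failures = summarize_selection(SAMPLE_LOG)
print(f"Selection method: {method}")
for reason in failures:
    print(f"Fallback reason: {reason}")
```

A result of LDNS together with a recorded failure suggests the client fell back to local DNS, which is the situation where a centralized enterprise resolver can lead to a distant DC.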
This article provided a brief overview of Netskope’s data plane selection mechanisms and the client logs you can use to see them in action. As of this writing (October 2022), most Netskope tenants are utilizing NewEdge Traffic Management 1.0. Reach out to your local account team to learn more about NewEdge Traffic Management 2.0 or to test it in a development tenant (non-production traffic). As mentioned earlier, other Netskope services such as the SMTP proxy and explicit proxy utilize DNS-based methods for optimal data plane and management plane selection.