Solved

[DLP] ChatGPT DLP is not working properly (newline \n)

  • 28 April 2023


Hello all,

 

I decided to post this here because my comment on the following post might not be seen:

Netskope DLP use cases for ChatGPT - The Netskope Community

 

We found the reason DLP is not working for ChatGPT.

ChatGPT input is sent as an HTTP POST, with the message inside the JSON payload.

The problem comes when the input text contains newline characters.

Newline characters are sent as '\n', and Netskope does not treat this as a newline but as a literal '\' followed by an 'n'.

This means that if I send, for instance, a list of card numbers or SSNs as follows:

132412341234

132412341234

 

What is actually sent is "132412341234\n132412341234".
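To illustrate the encoding, here is a minimal Python sketch of how a newline ends up in a JSON body. The `message` field name is an assumption made for illustration only; it is not taken from ChatGPT's actual request schema.

```python
import json

# Two numbers typed on separate lines, as in the example above.
user_input = "132412341234\n132412341234"

# "message" is a hypothetical field name, not ChatGPT's real schema.
body = json.dumps({"message": user_input})

print(body)
# {"message": "132412341234\n132412341234"}

# The serialized body contains no real newline character, only the
# two-character escape sequence: a backslash followed by 'n'.
has_real_newline = "\n" in body
has_escaped_newline = "\\n" in body
print(has_real_newline, has_escaped_newline)
# False True
```

Anything that inspects the raw body without JSON-decoding it first sees those two characters rather than a line break.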

Netskope implicitly applies word boundaries to entities. This means that word boundary characters (dot, comma, semicolon, whitespace, etc.) must be found before and after the sensitive data for it to match.

 

Given the format in which the data is sent to ChatGPT, what Netskope sees is:

"132412341234" and "n132412341234".

Thus the second SSN won't match because of the leading 'n'.

 

If you send the same SSNs separated by whitespace instead of a newline, they match correctly.
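The behaviour described above can be reproduced with an ordinary regular expression. This is only a sketch: the pattern `\b\d{12}\b` is a stand-in for a 12-digit entity with implicit word boundaries, and Python's `re` module is used purely for illustration. It is not a claim about how the Netskope DLP engine is implemented, and the `message` field name is hypothetical.

```python
import re

# Raw JSON bodies as seen on the wire. In the first one the separator is
# the two characters backslash + 'n', not a real newline.
newline_body = r'{"message": "132412341234\n132412341234"}'
space_body = r'{"message": "132412341234 132412341234"}'

# Stand-in for a 12-digit entity with word boundaries on both sides.
pattern = re.compile(r"\b\d{12}\b")

newline_matches = pattern.findall(newline_body)
space_matches = pattern.findall(space_body)

print(newline_matches)
# ['132412341234']
print(space_matches)
# ['132412341234', '132412341234']
```

In the first body, the backslash acts as a boundary after the first number, but the 'n' is a word character glued to the second number, so no boundary exists in front of it and the second number is missed.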

 

We have also reported this to our TAM, asking for an ER that lets us decide whether word boundaries are required or not.

As for how Netskope is supposed to read '\n', that is a more complicated matter. The first approach should do the trick.
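The two approaches can be compared in a small sketch. Everything here is hypothetical: `find_entities` and its `require_boundaries` flag are invented for illustration, the regular expression stands in for a 12-digit entity, and the `message` field name is assumed. None of it reflects an actual Netskope setting or API.

```python
import json
import re

# Raw JSON body with the escaped newline (backslash + 'n').
raw_body = r'{"message": "132412341234\n132412341234"}'


def find_entities(text, require_boundaries=True):
    """Hypothetical matcher: find 12-digit runs, optionally requiring
    word boundaries on both sides."""
    pattern = r"\b\d{12}\b" if require_boundaries else r"\d{12}"
    return re.findall(pattern, text)


# Approach 1: make word boundaries optional and match on the raw body.
without_boundaries = find_entities(raw_body, require_boundaries=False)

# Approach 2: decode the JSON first, so the escape becomes a real newline,
# then match with word boundaries still enabled.
decoded_message = json.loads(raw_body)["message"]
after_decoding = find_entities(decoded_message)

print(without_boundaries)
# ['132412341234', '132412341234']
print(after_decoding)
# ['132412341234', '132412341234']
```

Both approaches find both numbers in this example. Dropping boundaries is simpler, but it would also match 12 digits embedded in a longer digit run, so it is likely to raise more false positives than decoding the payload first.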

 

I hope my explanation is clear. Please let me know if you need more info or details.

Thanks and regards,

Òscar


Best answer by oscar 2 May 2023, 12:49


Hi all,

 

Please find a workaround here:

https://community.netskope.com/t5/Inside-Netskope-Security/Netskope-DLP-use-cases-for-ChatGPT/m-p/4186/highlight/true#M20

 

I hope it is helpful.

 

Kind regards,

Òscar
