I decided to post this here since my comment on the following post might not be seen:
Netskope DLP use cases for ChatGPT - The Netskope Community
We found the reason DLP is not working for ChatGPT.
ChatGPT input is sent as an HTTP POST, with the message inside the JSON payload.
The problem comes when the input text contains newline characters.
Newline characters are sent as the two-character escape sequence '\n', and Netskope does not treat this as a newline but as a literal '\' followed by an 'n'.
This means that if I send, for instance, a list of card numbers or SSNs as follows:
132412341234
132412341234
what is actually sent is "132412341234\n132412341234".
Netskope implicitly applies word boundaries to entities. This means a word boundary character (dot, comma, semicolon, whitespace, etc.) must be found both before and after the sensitive data for it to match.
Given the format in which the data is sent to ChatGPT, what Netskope sees is:
"132412341234", then '\', then "n132412341234".
Thus the second SSN won't match, because the leading 'n' means there is no word boundary in front of it.
If you send the same SSNs separated by whitespace instead of a newline, they match correctly.
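
For anyone who wants to reproduce the effect outside Netskope, here is a minimal Python sketch. It is only an illustration under assumptions: the regex `\b\d{12}\b` is a stand-in for an entity with implicit word boundaries (not Netskope's actual matching engine), and the `message` field name is hypothetical, not the real ChatGPT payload schema.

```python
import json
import re

# Assumption: a 12-digit entity with implicit word boundaries on both sides.
# This only mimics the behaviour described above; it is not Netskope's engine.
ENTITY = re.compile(r"\b\d{12}\b")


def raw_payload(message: str) -> str:
    """Serialize the message as it travels in the POST body.

    The "message" key is a hypothetical field name used for illustration.
    """
    return json.dumps({"message": message})


def matches_in_raw(message: str) -> list[str]:
    """Scan the serialized JSON text, where a newline is the two chars '\\' + 'n'."""
    return ENTITY.findall(raw_payload(message))


def matches_in_decoded(message: str) -> list[str]:
    """Scan the message after JSON decoding, where '\\n' is a real newline again."""
    decoded = json.loads(raw_payload(message))["message"]
    return ENTITY.findall(decoded)


if __name__ == "__main__":
    with_newline = "132412341234\n132412341234"
    with_space = "132412341234 132412341234"

    print(raw_payload(with_newline))
    print("raw, newline:    ", len(matches_in_raw(with_newline)))
    print("raw, space:      ", len(matches_in_raw(with_space)))
    print("decoded, newline:", len(matches_in_decoded(with_newline)))
```

Scanning the raw body finds only the first number when the separator is a newline, because the second one is glued to the 'n' of the escape sequence. With a space separator, or after decoding the JSON string first, both numbers are found.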
We have also reported this to our TAM, asking for an ER that would let us decide whether word boundaries are required or not.
As for how Netskope is supposed to interpret '\n', that is a more complicated matter. The first approach (making word boundaries optional) should do the trick.
I hope my explanation is clear. Please let me know if you need more info or details.
Thanks and regards,