Word Cloud - Formatting


Userlevel 5
Badge +16

Several of the Netskope Library examples that use the Word Cloud visualization are built on Justification. Because these values are largely user-generated, nearly every one is unique, so the cloud ends up with one entry per justification. Has there been any consideration of tokenizing them into either individual words (discarding filler words) or shorter phrases? A drill-in could then expose the justifications that contain the selected word or phrase.
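To make the idea concrete, here is a minimal Python sketch of that kind of tokenization. It is only an illustration of the request, not anything Netskope ships: the stop-word list, the `tokenize` and `build_index` names, and the sample justifications are all made up.

```python
import re
from collections import Counter, defaultdict

# Hypothetical filler-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "for", "i", "is", "it", "my", "of", "the", "this", "to"}


def tokenize(text):
    """Lowercase the text and return its words, minus filler words."""
    words = re.findall(r"[a-z0-9][a-z0-9'-]*", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def build_index(justifications):
    """Return (word counts, word -> set of justifications containing it)."""
    counts = Counter()
    index = defaultdict(set)
    for text in justifications:
        # set() so a word repeated within one justification counts once
        for word in set(tokenize(text)):
            counts[word] += 1
            index[word].add(text)
    return counts, index


# Made-up sample data, not real justifications
samples = [
    "Uploading my W-2 for tax filing",
    "Personal tax documents",
    "This is for testing",
]
counts, index = build_index(samples)
```

Here `counts` would size the words in the cloud, and `index[word]` is the drill-in: the full justifications that contain the selected word.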


2 replies

Userlevel 5
Badge +16

Hi @qyost

Yes! We've played around with this in the past. Specifically, we have looked at creating custom dimensions that summarize the justification reasons by matching commonly occurring wording. For example, we would look for strings like "tax", "w-2", "personal", "n/a", "not applicable", "enter justification reason here" and so on. The first match wins, so more specific strings go before more general ones, and anything that matches nothing falls through unchanged. Here is the logic I've used in the past if you want to leverage it:

 

 

coalesce(
  if(contains(lower(${app_event.justification_reason}), "individual tax"), "...individual tax...", null),
  if(contains(lower(${app_event.justification_reason}), "personal documents"), "...personal data...", null),
  if(contains(lower(${app_event.justification_reason}), "personal data"), "...personal data...", null),
  if(contains(lower(${app_event.justification_reason}), "not fmi data"), "...personal data...", null),
  if(contains(lower(${app_event.justification_reason}), "no phi"), "...no PHI...", null),
  if(contains(lower(${app_event.justification_reason}), "not phi"), "...no PHI...", null),
  if(contains(lower(${app_event.justification_reason}), "not fmi confidential or phi"), "...no PHI...", null),
  if(contains(lower(${app_event.justification_reason}), "isn't phi"), "...no PHI...", null),
  if(contains(lower(${app_event.justification_reason}), "personal file"), "...personal data...", null),
  if(contains(lower(${app_event.justification_reason}), "personal email"), "...personal data...", null),
  if(contains(lower(${app_event.justification_reason}), "testing"), "...testing...", null),
  if(contains(lower(${app_event.justification_reason}), "need to print"), "...need to print...", null),
  if(contains(lower(${app_event.justification_reason}), "printing"), "...printing...", null),
  if(contains(lower(${app_event.justification_reason}), "tax"), "...tax...", null),
  if(contains(lower(${app_event.justification_reason}), "personal"), "...personal...", null),
  ${app_event.justification_reason}
)

 

Userlevel 5
Badge +16

Eww. That's a lot more work than I was hoping for on the user side.
I was thinking of something more along the lines of the following happening on the back end:
  Quickly find common phrases in a large list of strings - DEV Community
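For what it's worth, the general shape of what I have in mind fits in a few lines of Python. This is a plain n-gram count, not necessarily the method used in the linked article, and the `ngrams` and `common_phrases` names, the `n` and `min_count` parameters, and the sample strings are all made up for illustration.

```python
import re
from collections import Counter


def ngrams(words, n):
    """Return every run of n consecutive words as a single phrase."""
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def common_phrases(texts, n=2, min_count=2):
    """Count n-word phrases across texts; keep those seen in >= min_count texts."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z0-9][a-z0-9'-]*", text.lower())
        # set() so each text contributes a given phrase at most once
        counts.update(set(ngrams(words, n)))
    return [(phrase, c) for phrase, c in counts.most_common() if c >= min_count]


# Made-up sample data, not real justifications
samples = [
    "Need to print personal documents",
    "personal documents for my taxes",
    "Need to print a boarding pass",
]
result = common_phrases(samples)
```

With these samples, the phrases that survive are the two-word runs shared by at least two justifications, which is exactly the grouping the hand-written rules above have to spell out one string at a time.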
