Sanitizing file names

Gentics Content.Node offers a central translation table that lets you configure how file names are sanitized.

Chapters

  1. Overview

1 Overview

Gentics Content.Node uses the $SANITIZE_CHARACTER array from the node.conf file to replace special characters in file names and folder paths. The Aloha Editor headerids plugin also uses this configuration to generate header IDs from text content.

The default settings for sanitizing characters are:


	"ü" => "ue", 
	"ä" => "ae", 
	"ö" => "oe", 
	"Ü" => "Ue", 
	"Ä" => "Ae", 
	"Ö" => "Oe", 
	"ß" => "ss",
	" " => "_"

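This will transform strings as follows. Special characters that have no entry in the table (such as ï, $ and % in these examples) are replaced by an underscore: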


	"äöï 23.jpg" => "aeoe__23.jpg"
	"ia 23$%.html" => "ia_23__.html"

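The two transformations above can be reproduced with a minimal sketch. This is not the Gentics Content.Node implementation: the function name sanitize_name is hypothetical, and the set of characters left untouched (ASCII letters, digits, "." and "_") is an assumption inferred only from the examples shown here.

```php
<?php
// Hypothetical sketch, not the actual Gentics Content.Node code.
// Default translation table as listed in this guide.
$SANITIZE_CHARACTER = array(
    "ü" => "ue", "ä" => "ae", "ö" => "oe",
    "Ü" => "Ue", "Ä" => "Ae", "Ö" => "Oe",
    "ß" => "ss", " " => "_",
);

function sanitize_name($name, array $table)
{
    // Step 1: apply the configured replacements.
    $name = strtr($name, $table);
    // Step 2 (assumption): every remaining character outside ASCII letters,
    // digits, "." and "_" becomes "_". The /u flag makes the pattern work on
    // UTF-8 characters, so a multi-byte character yields one underscore.
    return preg_replace('/[^A-Za-z0-9._]/u', '_', $name);
}

echo sanitize_name("äöï 23.jpg", $SANITIZE_CHARACTER), "\n";   // aeoe__23.jpg
echo sanitize_name('ia 23$%.html', $SANITIZE_CHARACTER), "\n"; // ia_23__.html
```

In the first call, ä and ö are translated by the table, while ï has no entry and falls through to the underscore fallback. Adding an entry for ï to the table changes that result.
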
You can redefine the predefined replacements or add new ones in the “node.conf” file like this:

/Node/etc/node.conf

$SANITIZE_CHARACTER["ï"] = "i";
$SANITIZE_CHARACTER["ä"] = "ae";

Do not replace any character by “/” or “\”, since those are separators for path names.

This only works with UTF-8. Replacement values should contain only alphanumeric characters or underscores.

Replacing a character with a non-alphanumeric character is not supported:

/Node/etc/node.conf

$SANITIZE_CHARACTER["a"] = "ä"; // ä will be replaced by _
$SANITIZE_CHARACTER["e"] = "ë"; // ë will be replaced by _
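
The rules above for replacement values can be checked before a table is deployed. The following is a minimal sketch: the helper find_unsupported_replacements is hypothetical and not part of Gentics Content.Node, and treating an empty replacement as unsupported is an assumption.

```php
<?php
// Hypothetical helper, not part of Gentics Content.Node.
// Returns the entries whose replacement value breaks the rules in this guide.
function find_unsupported_replacements(array $table)
{
    $bad = array();
    foreach ($table as $from => $to) {
        // Supported values: ASCII letters, digits and underscore only.
        // This also rejects the path separators "/" and "\".
        // Assumption: an empty replacement is treated as unsupported.
        if (!preg_match('/^[A-Za-z0-9_]+\z/', $to)) {
            $bad[$from] = $to;
        }
    }
    return $bad;
}

$table = array("ï" => "i", "ä" => "ae", "a" => "ä", "x" => "/");
print_r(find_unsupported_replacements($table)); // lists the "a" and "x" entries
```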