OUR THOUGHTS: XSS (Cross-Site Scripting)

Brief:

While coding, we add validations at the entry points of a system. Validations are essential for keeping invalid data out and the system secure. If they are missing and an attacker enters a malicious script, that script gets stored in the system. When the data is later retrieved and rendered, the script may run and disrupt the normal flow of the application. This is hazardous for any application: it can lead to significant data loss and, as a result, a loss of user trust in the system. It can even crash the whole system and stop it from functioning.

Need:

To avoid this, in one of our projects we have implemented protection against XSS (Cross-Site Scripting). It identifies malicious script entered by a user and prevents it from being stored in the system. As a result, the script never runs and the attacker's attempt to compromise the system fails.

How It Works:

We have added several rules to guard against Cross-Site Scripting. We have created a common config file from which we can enable or disable the XSS rules. The rules are as follows:

We do not accept technical keywords in any input field, such as: 'javascript', 'expression', 'vbscript', 'jscript', 'wscript', 'vbs', 'script', 'base64', 'applet', 'alert', 'document', 'write', 'cookie', 'window', 'confirm', 'prompt', 'eval'

    Hackers can add scripts like:

    <html>
    <h1>Most recent comment</h1>
    <script>doSomethingEvil();</script>
    </html>
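
The keyword rule and its on/off switch can be sketched as follows. This is a minimal illustration, not the project's code: the `$xss_rules` array, its `block_keywords` key and the `find_blocked_keyword()` helper are hypothetical names, since the post does not show the real config file.

```php
<?php
// Hypothetical rule switches; the real config file layout is not shown in this post.
$xss_rules = array(
    'block_keywords' => true,
);

// Keywords taken from the list above.
$blocked_keywords = array(
    'javascript', 'expression', 'vbscript', 'jscript', 'wscript', 'vbs',
    'script', 'base64', 'applet', 'alert', 'document', 'write', 'cookie',
    'window', 'confirm', 'prompt', 'eval',
);

/**
 * Returns the first blocked keyword found in $input, or null if none is
 * found or the rule is switched off.
 */
function find_blocked_keyword($input, array $keywords, array $rules)
{
    if (empty($rules['block_keywords'])) {
        return null;
    }
    foreach ($keywords as $keyword) {
        // stripos() is case-insensitive, so "JavaScript" is caught too.
        if (stripos($input, $keyword) !== false) {
            return $keyword;
        }
    }
    return null;
}
```

Plain substring matching is simple but strict: a harmless word such as "description" also contains "script", so a rule like this trades some false positives for safety.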

We filter out invisible (control) characters in the ranges \x00-\x08, \x0B, \x0C, \x0E-\x1F and \x7F. When these characters are entered in an input field, they are stored but not displayed. An attacker might therefore insert invisible characters between the letters of a keyword, for example J A V A S C R I P T, to slip past the keyword check.

    // Note: despite its name, the body shown here matches HTML tags and
    // passes each match to _sanitize_naughty_html().
    function remove_invisible_characters($str)
    {
      $pattern = '#'
      . '<((?<slash>/*\s*)(?<tagName>[a-z0-9]+)(?=[^a-z0-9]|$)' // tag start and name, followed by a non-tag character
      . '[^\s\042\047a-z0-9>/=]*' // a valid attribute character immediately after the tag would count as a separator
      // optional attributes
      . '(?<attributes>(?:[\s\042\047/=]*' // non-attribute characters, excluding > (tag close) for obvious reasons
      . '[^\s\042\047>/=]+' // attribute characters
      // optional attribute-value
      . '(?:\s*=' // attribute-value separator
      // single, double or non-quoted value
      . '(?:[^\s\042\047=><`]+|\s*\042[^\042]*\042|\s*\047[^\047]*\047|\s*(?U:[^\s\042\047=><`]*))'
      . ')?' // end optional attribute-value group
      . ')*)' // end optional attributes group
      . '[^>]*)(?<closeTag>\>)?#isS';

      // Note: It would be nice to optimize this for speed, BUT
      //       only matching the naughty elements here results in
      //       false positives and in turn - vulnerabilities!

      // Repeat until a pass makes no further changes
      do {
          $old_str = $str;
          $str = preg_replace_callback($pattern, array($this, '_sanitize_naughty_html'), $str);
      } while ($old_str !== $str);

      return $str;
    }
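
The invisible-character rule itself comes down to a single character-class replacement. A minimal sketch, assuming a standalone helper named `strip_invisible_characters()` (a hypothetical name used here for illustration, not the project's code):

```php
<?php
/**
 * Removes control characters from $str while keeping tab (\x09),
 * newline (\x0A) and carriage return (\x0D).
 */
function strip_invisible_characters($str)
{
    // Character class covering \x00-\x08, \x0B, \x0C, \x0E-\x1F and \x7F.
    $pattern = '/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]+/';

    return preg_replace($pattern, '', $str);
}
```

Running this before the keyword check means an input such as "java", a NUL byte, then "script" collapses back to "javascript" and is caught by the keyword rule.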

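Some strings and patterns are never allowed in input fields. Literal strings are swapped out using a lookup table, and anything that matches one of the blocked regular expressions is replaced with '[removed]'. In the code below, '#' is the regex delimiter and 'is' are the flags for case-insensitive and multi-line matching.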

     protected function _do_never_allowed($str) {
            $str = str_replace(array_keys($this->_never_allowed_str), $this->_never_allowed_str, $str);

            foreach ($this->_never_allowed_regex as $regex) {
                $str = preg_replace('#' . $regex . '#is', '[removed]', $str);
            }

            return $str;
        }
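
To make the two lookup tables concrete, here is a standalone sketch of the same idea. The contents of `$never_allowed_str` and `$never_allowed_regex` below are hypothetical examples chosen for illustration; the project's actual lists are not shown in this post.

```php
<?php
// Hypothetical tables for illustration only.
$never_allowed_str = array(
    'document.cookie' => '[removed]',
    '<!--'            => '&lt;!--',
    '-->'             => '--&gt;',
);
$never_allowed_regex = array(
    'javascript\s*:',
    'vbscript\s*:',
);

function do_never_allowed($str, array $never_allowed_str, array $never_allowed_regex)
{
    // Literal strings: each key is replaced by its mapped value.
    $str = str_replace(array_keys($never_allowed_str), $never_allowed_str, $str);

    // Regex fragments: '#' is the delimiter, 'i' ignores case,
    // 's' lets '.' match newlines.
    foreach ($never_allowed_regex as $regex) {
        $str = preg_replace('#' . $regex . '#is', '[removed]', $str);
    }

    return $str;
}
```

With these tables, `x = document.cookie;` becomes `x = [removed];` and `JavaScript :alert(1)` becomes `[removed]alert(1)`.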

We remove naughty HTML elements such as the following:

            static $naughty_tags = array(
                'alert', 'prompt', 'confirm', 'applet', 'audio', 'basefont', 'base', 'behavior', 'bgsound', 'blink', 'body', 'embed', 'expression', 'form', 'frameset', 'frame', 'head', 'html', 'ilayer', 'iframe', 'input', 'button', 'select', 'isindex', 'layer', 'link', 'meta', 'keygen', 'object', 'plaintext', 'style', 'script', 'textarea', 'title', 'math', 'video', 'svg', 'xml', 'xss'
            );

            static $evil_attributes = array(
                'on\w+', 'style', 'xmlns', 'formaction', 'form', 'xlink:href', 'FSCommand', 'seekSegmentTime'
            );

            // First, escape unclosed tags
            if (empty($matches['closeTag'])) {
                return '&lt;' . $matches[1];
            }
            // Is the element that we caught naughty? If so, escape it
            elseif (in_array(strtolower($matches['tagName']), $naughty_tags, TRUE)) {
                return '&lt;' . $matches[1] . '&gt;';
            }
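
The fragment above does not show how `$evil_attributes` is applied. One possible approach, sketched here under the assumption that each entry is a regex fragment matching an attribute name, is to build a single pattern from the list and strip matching attributes from a tag. The `strip_evil_attributes()` helper is hypothetical and is not the project's implementation.

```php
<?php
// Attribute-name patterns taken from the $evil_attributes list above.
$evil_attributes = array(
    'on\w+', 'style', 'xmlns', 'formaction', 'form', 'xlink:href', 'FSCommand', 'seekSegmentTime'
);

/**
 * Removes every attribute whose name matches one of $evil_attributes
 * from a single HTML tag. Handles double-quoted, single-quoted and
 * unquoted values.
 */
function strip_evil_attributes($tag, array $evil_attributes)
{
    $pattern = '#\s+(?:' . implode('|', $evil_attributes) . ')'
             . '\s*=\s*(?:"[^"]*"|\'[^\']*\'|[^\s>]+)#i';

    return preg_replace($pattern, '', $tag);
}
```

This simple version can also alter text inside another attribute's value that happens to look like an attribute assignment, so it is best read as an illustration of the rule rather than a complete filter.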

We remove JavaScript links and JavaScript image paths. The patterns we look for in href and src values include 'javascript:', 'livescript:', 'mocha:', 'charset=', 'window.', 'document.', '.cookie', '<script', '<xss' and 'base64,'.

    protected function _js_link_removal($match) {
        return str_replace(
            $match[1],
            preg_replace(
                // Built by concatenation so that no line break ends up inside the regex
                '#href=.*?(?:(?:alert|prompt|confirm)(?:\(|&\#40;)|javascript:|'
                . 'livescript:|mocha:|charset=|window\.|document\.|\.cookie|<script'
                . '|<xss|data\s*:)#si',
                '',
                $this->_filter_attributes($match[1])
            ),
            $match[0]
        );
    }

    protected function _js_img_removal($match) {
        return str_replace(
            $match[1],
            preg_replace(
                // Built by concatenation so that no line break ends up inside the regex
                '#src=.*?(?:(?:alert|prompt|confirm|eval)(?:\(|&\#40;)|javascript:|'
                . 'livescript:|mocha:|charset=|window\.|document\.|\.cookie|<script|'
                . '<xss|base64\s*,)#si',
                '',
                $this->_filter_attributes($match[1])
            ),
            $match[0]
        );
    }
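
To see what the href pattern does on its own, it can be applied directly to an attribute string. The sketch below reuses the pattern from `_js_link_removal()`; the `strip_js_href()` wrapper is a hypothetical helper added only for this demonstration.

```php
<?php
/**
 * Applies the href pattern to a string of tag attributes and removes
 * everything from "href=" up to and including the first blocked token.
 */
function strip_js_href($attributes)
{
    $pattern = '#href=.*?(?:(?:alert|prompt|confirm)(?:\(|&\#40;)|javascript:|'
             . 'livescript:|mocha:|charset=|window\.|document\.|\.cookie|<script'
             . '|<xss|data\s*:)#si';

    return preg_replace($pattern, '', $attributes);
}
```

For `href="javascript:alert(1)"` the match covers `href="javascript:`, so the attribute is broken and can no longer act as a script link. An ordinary URL contains none of the blocked tokens and passes through unchanged.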
