Unicode Transformation (Best-Fit Mapping) - Vulnerability Database

Description

This vulnerability occurs when the application mishandles Unicode character transformations. Three cases are covered: Best-Fit Mappings, where a character with no exact equivalent in the target encoding is silently replaced by a similar-looking one during conversion (for example, a fullwidth less-than sign U+FF1C becoming an ASCII "<"); Overlong UTF-8 sequences, where a character is encoded using more bytes than necessary (such as representing U+000A as 0xC0 0x8A instead of 0x0A); and Ill-Formed sequences, which are invalid byte combinations in a UTF-8 stream. Because different components may decode or convert the same bytes differently, an attacker can craft input that passes a security filter or validation step in one form and is later transformed into a dangerous character by another component.
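
The overlong case can be illustrated with a minimal Java sketch. The class and method names are hypothetical, and the decoder shown is deliberately lenient to stand in for a non-conforming component; it is not how the JDK decodes UTF-8.

```java
final class OverlongDemo {
    // Byte-level filter: looks for a raw line feed (0x0A) in the undecoded bytes.
    static boolean containsRawLineFeed(byte[] data) {
        for (byte b : data) {
            if (b == 0x0A) {
                return true;
            }
        }
        return false;
    }

    // Deliberately lenient two-byte decoder (illustration only): it combines
    // the payload bits without checking that the code point needed two bytes.
    static int lenientDecodeTwoBytes(byte lead, byte trail) {
        return ((lead & 0x1F) << 6) | (trail & 0x3F);
    }

    public static void main(String[] args) {
        byte[] overlong = {(byte) 0xC0, (byte) 0x8A};
        // The filter sees no 0x0A byte, so the input passes.
        System.out.println(containsRawLineFeed(overlong));
        // The lenient decoder still produces U+000A (line feed).
        System.out.println(lenientDecodeTwoBytes(overlong[0], overlong[1]) == '\n');
    }
}
```

The filter and the decoder disagree about the same two bytes, which is the inconsistency an attacker relies on.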

Remediation

Implement the following measures to prevent Unicode transformation vulnerabilities:

1. Enforce Strict UTF-8 Validation: Reject overlong and otherwise ill-formed UTF-8 sequences instead of repairing or silently replacing them. Every decoder in the request path should accept only the shortest-form encoding of each character, that is, the minimum number of bytes required for its code point.
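
Example (Java), as a minimal sketch rather than a definitive implementation: the JDK's built-in UTF-8 decoder can be configured to report malformed input instead of replacing it with U+FFFD. The class name StrictUtf8 and its methods are hypothetical; the sketch assumes the whole input is available as a byte array.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class StrictUtf8 {
    // Decodes bytes as UTF-8 and throws instead of substituting U+FFFD
    // when the input is ill-formed (for example an overlong sequence).
    static String decode(byte[] data) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            return decoder.decode(ByteBuffer.wrap(data)).toString();
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException("Input is not well-formed UTF-8", e);
        }
    }

    // Convenience wrapper for callers that only need a yes/no answer.
    static boolean isWellFormed(byte[] data) {
        try {
            decode(data);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed(new byte[] {(byte) 0xC0, (byte) 0x8A}));
        System.out.println(isWellFormed(new byte[] {(byte) 0xC3, (byte) 0xA9}));
    }
}
```

Note that new String(bytes, StandardCharsets.UTF_8) does not throw on ill-formed input; it substitutes a replacement character, so it cannot be used to detect these sequences.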

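2. Normalize Unicode Input: Apply Unicode normalization (NFC or NFKC) to all user input before validation and further processing, so that equivalent character sequences share one representation. NFKC also folds compatibility characters, such as fullwidth forms, into their plain equivalents, so validation must run on the normalized value, never on the raw input.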

Example (Java):

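import java.text.Normalizer;

String userInput = request.getParameter("input");
// getParameter returns null when the parameter is absent; normalize() would throw on null
if (userInput == null) {
    throw new SecurityException("Missing input");
}
// Normalize to NFC form
String normalized = Normalizer.normalize(userInput, Normalizer.Form.NFC);
// Validate the normalized input (isValid is the application's own check)
if (!isValid(normalized)) {
    throw new SecurityException("Invalid input detected");
}
// From here on, use only the normalized value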

3. Use Encoding-Aware Security Libraries: Employ security libraries that properly handle Unicode transformations when performing input validation, output encoding, and sanitization.

4. Avoid Best-Fit Mappings: When converting between character encodings, use strict conversion modes that reject unmappable characters rather than substituting them with best-fit alternatives.
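
Example (Java), as a minimal sketch: a CharsetEncoder configured with CodingErrorAction.REPORT fails on characters the target encoding cannot represent instead of substituting something else. The class name StrictEncode and its methods are hypothetical, and ISO-8859-1 is only an example target.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class StrictEncode {
    // Encodes text into the target charset and throws if any character
    // has no exact representation there (no substitution of any kind).
    static byte[] encode(String text, Charset target) {
        CharsetEncoder encoder = target.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            ByteBuffer out = encoder.encode(CharBuffer.wrap(text));
            byte[] bytes = new byte[out.remaining()];
            out.get(bytes);
            return bytes;
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException(
                    "Text cannot be represented exactly in " + target.name(), e);
        }
    }

    // Convenience wrapper for callers that only need a yes/no answer.
    static boolean canEncodeExactly(String text, Charset target) {
        try {
            encode(text, target);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Fullwidth angle brackets have no ISO-8859-1 equivalent.
        System.out.println(canEncodeExactly("\uFF1Cscript\uFF1E", StandardCharsets.ISO_8859_1));
        System.out.println(canEncodeExactly("caf\u00E9", StandardCharsets.ISO_8859_1));
    }
}
```

String.getBytes(Charset) and Charset.encode(String) replace unmappable characters with the charset's default replacement rather than failing, so they are not suitable where a lossy conversion must be detected.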

5. Validate After Transformation: Perform security validation after all character encoding transformations are complete, not before, to ensure the final character representation is safe.
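
Example (Java), as a minimal sketch of why the order matters: NFKC normalization turns fullwidth angle brackets (U+FF1C, U+FF1E) into ASCII "<" and ">". The class name, both method names, and the isValid rule (no angle brackets) are assumptions made for this illustration only.

```java
import java.text.Normalizer;

final class ValidateAfterTransform {
    // Hypothetical application rule for this sketch: no angle brackets.
    static boolean isValid(String s) {
        return s.indexOf('<') < 0 && s.indexOf('>') < 0;
    }

    // Wrong order: the raw input is validated, then transformed.
    // Fullwidth brackets pass the check and become ASCII brackets afterwards.
    static String validateThenNormalize(String raw) {
        if (!isValid(raw)) {
            throw new SecurityException("Invalid input detected");
        }
        return Normalizer.normalize(raw, Normalizer.Form.NFKC);
    }

    // Right order: transform first, then validate the final representation.
    static String normalizeThenValidate(String raw) {
        String normalized = Normalizer.normalize(raw, Normalizer.Form.NFKC);
        if (!isValid(normalized)) {
            throw new SecurityException("Invalid input detected");
        }
        return normalized;
    }

    public static void main(String[] args) {
        String raw = "\uFF1Cscript\uFF1E";
        System.out.println(validateThenNormalize(raw)); // prints "<script>"
        try {
            normalizeThenValidate(raw);
        } catch (SecurityException e) {
            System.out.println("rejected");
        }
    }
}
```

The same ordering applies to any other transformation in the pipeline, such as charset conversion or URL decoding: the check belongs after the last step that can change which characters are present.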
