utf8_uri_encode() WP 1.5.0

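Encodes the non-ASCII (multibyte UTF-8) characters of a string as percent-encoded octets for use in a URI. ASCII characters are left unchanged unless $encode_ascii_characters is true.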

No Hooks.

Return

String. The string with its non-ASCII characters percent-encoded for use in a URI.

Usage

utf8_uri_encode( $utf8_string, $length, $encode_ascii_characters );
$utf8_string(string) (required)
String to encode.
$length(int)
Maximum length of the encoded string. 0 means no limit.
Default: 0
$encode_ascii_characters(bool)
Whether to also encode ASCII characters such as < " ' (they are passed through rawurlencode()).
Default: false

Examples

#1 Encode non-Latin characters for use in a URI:

$utf8_string = "http://example.com/ссылка-на_istochnik";

echo utf8_uri_encode( $utf8_string );
// returns:
// http://example.com/%d1%81%d1%81%d1%8b%d0%bb%d0%ba%d0%b0-%d0%bd%d0%b0_istochnik
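
#2 Limit the result length and encode ASCII characters (a sketch):

The two optional parameters can be illustrated as follows. This assumes the code runs inside WordPress (a theme, a plugin, or WP-CLI), where utf8_uri_encode() is defined; the function_exists() guard and the $results array are only there so the snippet does not fail elsewhere. The values in the comments were traced by hand through the function source for WP 6.4.3.

```php
<?php
// Assumption: WordPress is loaded, so utf8_uri_encode() is available.
$results = array();

if ( function_exists( 'utf8_uri_encode' ) ) {

	// $length caps the encoded result. Each Cyrillic letter takes 6 characters
	// once encoded (2 octets x 3), and an incomplete character is never emitted.
	$results['length'] = utf8_uri_encode( 'ссылка', 12 );
	// %d1%81%d1%81

	// $encode_ascii_characters = true passes ASCII characters through rawurlencode().
	$results['ascii'] = utf8_uri_encode( 'a<b', 0, true );
	// a%3Cb

	echo implode( "\n", $results ), "\n";
}
```

Note that rawurlencode() produces uppercase hex digits for ASCII characters, while the multibyte branch uses dechex() and therefore produces lowercase ones.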

Changelog

Since 1.5.0 Introduced.
Since 5.8.3 Added the encode_ascii_characters parameter.

utf8_uri_encode() code WP 6.4.3

function utf8_uri_encode( $utf8_string, $length = 0, $encode_ascii_characters = false ) {
	$unicode        = '';
	$values         = array();
	$num_octets     = 1;
	$unicode_length = 0;

	mbstring_binary_safe_encoding();
	$string_length = strlen( $utf8_string );
	reset_mbstring_encoding();

	for ( $i = 0; $i < $string_length; $i++ ) {

		$value = ord( $utf8_string[ $i ] );

		if ( $value < 128 ) {
			$char                = chr( $value );
			$encoded_char        = $encode_ascii_characters ? rawurlencode( $char ) : $char;
			$encoded_char_length = strlen( $encoded_char );
			if ( $length && ( $unicode_length + $encoded_char_length ) > $length ) {
				break;
			}
			$unicode        .= $encoded_char;
			$unicode_length += $encoded_char_length;
		} else {
			if ( count( $values ) === 0 ) {
				if ( $value < 224 ) {
					$num_octets = 2;
				} elseif ( $value < 240 ) {
					$num_octets = 3;
				} else {
					$num_octets = 4;
				}
			}

			$values[] = $value;

			if ( $length && ( $unicode_length + ( $num_octets * 3 ) ) > $length ) {
				break;
			}
			if ( count( $values ) === $num_octets ) {
				for ( $j = 0; $j < $num_octets; $j++ ) {
					$unicode .= '%' . dechex( $values[ $j ] );
				}

				$unicode_length += $num_octets * 3;

				$values     = array();
				$num_octets = 1;
			}
		}
	}

	return $unicode;
}
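
The core idea of the function above, percent-encoding every byte outside the ASCII range, can be sketched in plain PHP without WordPress. my_percent_encode_non_ascii() is a hypothetical helper name, not part of WordPress. It is a simplification: it ignores $length and $encode_ascii_characters, and it should match the real function only for well-formed UTF-8 input (the real function groups bytes into characters and drops an incomplete trailing sequence).

```php
<?php
/**
 * Hypothetical helper (not a WordPress function): percent-encodes every
 * byte >= 0x80 as lowercase %xx and leaves ASCII bytes untouched.
 */
function my_percent_encode_non_ascii( string $utf8_string ): string {
	// No /u modifier: the pattern matches single bytes, not code points.
	return (string) preg_replace_callback(
		'/[\x80-\xff]/',
		static function ( array $match ): string {
			return '%' . dechex( ord( $match[0] ) );
		},
		$utf8_string
	);
}

echo my_percent_encode_non_ascii( 'http://example.com/ссылка-на_istochnik' ), "\n";
// http://example.com/%d1%81%d1%81%d1%8b%d0%bb%d0%ba%d0%b0-%d0%bd%d0%b0_istochnik
```

Because every byte from 0x80 upward always yields two hex digits, dechex() needs no zero padding here.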