_mb_substr()
Internal compat function to mimic mb_substr().
Only supports UTF-8 and non-shifting single-byte encodings. For all other encodings, expect the substrings to be misaligned. When the given encoding (or blog_charset, if none is provided) isn't UTF-8, the function returns the output of substr().
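To make "misaligned" concrete, here is a small sketch of what byte-level extraction does to single-byte and multi-byte text. It uses plain substr() only. The UTF-8 string is merely a convenient stand-in for any multi-byte encoding handled as raw bytes; it does not show what _mb_substr() does with UTF-8 input.

```php
<?php
// "é" occupies two bytes in UTF-8 (0xC3 0xA9) but one byte in ISO-8859-1 (0xE9).
$utf8   = "caf\u{E9} au lait"; // 12 characters, 13 bytes.
$latin1 = "caf\xE9 au lait";   // 12 characters, 12 bytes.

// Single-byte encoding: byte offsets equal character offsets, so substr() is safe.
$latin1_cut = substr( $latin1, 0, 4 );

// Multi-byte text handled as bytes: the cut lands in the middle of "é".
$utf8_cut = substr( $utf8, 0, 4 );

echo bin2hex( $latin1_cut ), "\n"; // 636166e9 ("café" in ISO-8859-1)
echo bin2hex( $utf8_cut ), "\n";   // 636166c3 ("caf" plus a dangling lead byte)
```

The second result ends in half a character, which is the kind of misalignment to expect when a multi-byte encoding other than UTF-8 goes through the substr() fallback.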
Internal function: this function is intended for use by WordPress core itself. Using it in your own code is not recommended.
No Hooks.
Returns
String. Extracted substring.
Usage
_mb_substr( $str, $start, $length, $encoding );
- $str(string) (required)
- The string to extract the substring from.
- $start(int) (required)
- Character offset at which to start the substring extraction.
- $length(int|null)
- Maximum number of characters to extract from $str.
Default: null
- $encoding(string|null)
- Character encoding to use.
Default: null
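Because the function mimics mb_substr(), $start and $length count characters, and negative values count from the end of the string. The sketch below illustrates those semantics without loading WordPress. demo_char_substr() is a hypothetical helper written for this example, not part of core, and it assumes valid UTF-8 input.

```php
<?php
/**
 * Hypothetical helper (not part of WordPress): character-based substring
 * built from preg_split() and array_slice(), for illustration only.
 */
function demo_char_substr( string $text, int $start, ?int $length = null ): string {
	// Split valid UTF-8 into an array of single characters.
	$chars = preg_split( '//u', $text, -1, PREG_SPLIT_NO_EMPTY );
	if ( false === $chars ) {
		return ''; // Invalid UTF-8 input; this sketch simply gives up.
	}

	// array_slice() counts a negative offset from the end and treats a negative
	// length as "stop that many elements before the end", much like substr().
	return implode( '', array_slice( $chars, $start, $length ) );
}

$text = "na\u{EF}ve caf\u{E9}"; // "naïve café", 10 characters.

echo demo_char_substr( $text, 2, 3 ), "\n";  // ïve
echo demo_char_substr( $text, -4 ), "\n";    // café
echo demo_char_substr( $text, 0, -5 ), "\n"; // naïve
```

On a site whose charset is UTF-8, the corresponding _mb_substr() calls would be expected to return the same strings, though that is an assumption and not something this sketch verifies.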
Changelog
| Since 3.2.0 | Introduced. |
_mb_substr() code WP 7.0
function _mb_substr( $str, $start, $length = null, $encoding = null ) {
	if ( null === $str ) {
		return '';
	}

	// The solution below works only for UTF-8; treat all other encodings as byte streams.
	if ( ! _is_utf8_charset( $encoding ?? get_option( 'blog_charset' ) ) ) {
		return is_null( $length ) ? substr( $str, $start ) : substr( $str, $start, $length );
	}

	$total_length = ( $start < 0 || $length < 0 )
		? _wp_utf8_codepoint_count( $str )
		: 0;

	$normalized_start = $start < 0
		? max( 0, $total_length + $start )
		: $start;

	/*
	 * The starting offset is provided as characters, which means this needs to
	 * find how many bytes that many characters occupies at the start of the string.
	 */
	$starting_byte_offset = _wp_utf8_codepoint_span( $str, 0, $normalized_start );

	$normalized_length = $length < 0
		? max( 0, $total_length - $normalized_start + $length )
		: $length;

	/*
	 * This is the main step. It finds how many bytes the given length of code points
	 * occupies in the input, starting at the byte offset calculated above.
	 */
	$byte_length = isset( $normalized_length )
		? _wp_utf8_codepoint_span( $str, $starting_byte_offset, $normalized_length )
		: ( strlen( $str ) - $starting_byte_offset );

	// The result is a normal byte-level substring using the computed ranges.
	return substr( $str, $starting_byte_offset, $byte_length );
}
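The two span calculations in the listing can be tried outside WordPress with a simplified stand-in. demo_codepoint_span() below is a hypothetical helper, not the core _wp_utf8_codepoint_span(). It assumes well-formed UTF-8 and a starting offset on a character boundary, and it makes no attempt to handle invalid byte sequences the way core may.

```php
<?php
/**
 * Hypothetical stand-in for a code-point span helper: returns how many bytes
 * the next $max_code_points code points occupy, starting at $byte_offset.
 * Assumes well-formed UTF-8 and an offset on a character boundary.
 */
function demo_codepoint_span( string $text, int $byte_offset, int $max_code_points ): int {
	$end   = strlen( $text );
	$at    = $byte_offset;
	$count = 0;

	while ( $at < $end && $count < $max_code_points ) {
		$at++; // Consume the lead byte of the next character.

		// Consume any continuation bytes (bit pattern 10xxxxxx).
		while ( $at < $end && ( ord( $text[ $at ] ) & 0xC0 ) === 0x80 ) {
			$at++;
		}

		$count++;
	}

	return $at - $byte_offset;
}

$text = "日本語のテキスト"; // 8 characters, 3 bytes each in UTF-8.

// Step 1: convert the character offset (2) into a byte offset.
$start_bytes = demo_codepoint_span( $text, 0, 2 );

// Step 2: convert the character length (3) into a byte length from that offset.
$length_bytes = demo_codepoint_span( $text, $start_bytes, 3 );

// Step 3: an ordinary byte-level substring using the computed range.
$result = substr( $text, $start_bytes, $length_bytes );

echo $start_bytes, ' ', $length_bytes, ' ', $result, "\n"; // 6 9 語のテ
```

Working in byte spans means the final substr() call never splits a character, which appears to be why the listing converts both the offset and the length before extracting.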