WP_REST_URL_Details_Controller::get_meta_with_content_elements │ private │ WP 5.9.0

Gets all the meta tag elements that have a 'content' attribute.

Method of the class: WP_REST_URL_Details_Controller{}

No Hooks.

Returns

Array. A multidimensional indexed array on success, else empty array.

Usage

// private - can only be called from inside the class itself
$result = $this->get_meta_with_content_elements( $html );

$html(string) (required): The string of HTML to be parsed.
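
Because the method is private, it cannot be called from outside the class. The following is a minimal sketch for experimenting with the return value, assuming the pattern from the method's source is copied into a standalone function. The name `demo_get_meta_with_content_elements()` and the sample HTML are hypothetical and not part of WordPress.

```php
<?php
// Hypothetical standalone copy of the method's pattern, for experimentation only.
function demo_get_meta_with_content_elements( $html ) {
	$pattern = '#<meta\s[^>]*content=(["\']??)(.*)\1[^>]*\/?>#isU';
	preg_match_all( $pattern, $html, $elements );
	return $elements;
}

// Sample input (an assumption made for this sketch).
$html = <<<'HTML'
<meta charset="utf-8">
<meta name="description" content="Widgets > gadgets">
<meta property="og:title" content="It's here" />
HTML;

$elements = demo_get_meta_with_content_elements( $html );

// $elements[0]: the full matched elements (the charset meta has no content attribute, so it is skipped).
// $elements[1]: the quotation mark captured for each match.
// $elements[2]: the content attribute values.
print_r( $elements[2] );

// Input without any matching meta element.
$none = demo_get_meta_with_content_elements( '<p>No meta here</p>' );
```

With the default flags, `preg_match_all()` groups results by capture group, so index 2 holds the content values. When nothing matches, `$none` still contains one empty sub-array per group rather than a completely empty array.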

Changelog

Since 5.9.0

Introduced.

WP_REST_URL_Details_Controller::get_meta_with_content_elements() code (WP 7.0)

wp-includes/rest-api/endpoints/class-wp-rest-url-details-controller.php

private function get_meta_with_content_elements( $html ) {
	/*
	 * Parse all meta elements with a content attribute.
	 *
	 * Why first search for the content attribute rather than directly searching for name=description element?
	 * tl;dr The content attribute's value will be truncated when it contains a > symbol.
	 *
	 * The content attribute's value (i.e. the description to get) can have HTML in it and be well-formed as
	 * it's a string to the browser. Imagine what happens when attempting to match for the name=description
	 * first. Hmm, if a > or /> symbol is in the content attribute's value, then it terminates the match
	 * as the element's closing symbol. But wait, it's in the content attribute and is not the end of the
	 * element. This is a limitation of using regex. It can't determine "wait a minute this is inside of quotation".
	 * If this happens, what gets matched is not the entire element or all of the content.
	 *
	 * Why not search for the name=description and then content="(.*)"?
	 * The attribute order could be opposite. Plus, additional attributes may exist including being between
	 * the name and content attributes.
	 *
	 * Why not lookahead?
	 * Lookahead is not constrained to stay within the element. The first <meta it finds may not include
	 * the name or content, but rather could be from a different element downstream.
	 */
	$pattern = '#<meta\s' .

			/*
			 * Allows for additional attributes before the content attribute.
			 * Searches for anything other than > symbol.
			 */
			'[^>]*' .

			/*
			* Find the content attribute. When found, capture its value (.*).
			*
			* Allows for (a) single or double quotes and (b) whitespace in the value.
			*
			* Why capture the opening quotation mark, i.e. (["\']), and then backreference,
			* i.e. \1, for the closing quotation mark?
			* To ensure the closing quotation mark matches the opening one. Why? Attribute values
			* can contain quotation marks, such as an apostrophe in the content.
			*/
			'content=(["\']??)(.*)\1' .

			/*
			* Allows for additional attributes after the content attribute.
			* Searches for anything other than > symbol.
			*/
			'[^>]*' .

			/*
			* \/?> searches for the closing > symbol, which can be in either /> or > format.
			* # ends the pattern.
			*/
			'\/?>#' .

			/*
			* These are the options:
			* - i : case-insensitive
			* - s : allows newline characters for the . match (needed for multiline elements)
			* - U : non-greedy matching (quantifiers are lazy by default)
			*/
			'isU';

	preg_match_all( $pattern, $html, $elements );

	return $elements;
}
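
The truncation problem described in the comment above can be reproduced with a small comparison. This is only a sketch: the `$naive` pattern is a hypothetical example of matching on name=description first and does not appear in WordPress, while `$core` is the method's pattern joined into one string.

```php
<?php
// Sample element whose content value contains a > symbol (an assumption made for this sketch).
$html = '<meta name="description" content="Widgets > gadgets">';

// Hypothetical naive approach: look for name=description, then run to the first > symbol.
$naive = '#<meta\s[^>]*name=["\']description["\'][^>]*>#i';
preg_match( $naive, $html, $naive_match );

// The pattern from the method above, written as a single string.
$core = '#<meta\s[^>]*content=(["\']??)(.*)\1[^>]*\/?>#isU';
preg_match( $core, $html, $core_match );

// The naive match stops at the > inside the attribute value.
echo $naive_match[0], "\n"; // <meta name="description" content="Widgets >

// The core pattern captures the whole value, because the match must reach the closing quotation mark.
echo $core_match[2], "\n"; // Widgets > gadgets
```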
