Metadata in WordPress

Under the word "metadata" in WordPress, there is a huge amount of work and amazing logic, which you can scold and praise at the same time, but one thing you can't take away from metadata - they are very handy. At the same time, figuring out how they work is pretty easy.

Read also how taxonomies in WordPress are arranged
Read also how posts in WordPress are arranged

What is metadata?

Metadata in WordPress is data that complements basic data. They are also called meta-fields, arbitrary fields, custom fields. In other words, metadata is an additional table in the database that extends the main table.

For example:

  • Posts has basic data: content, title, ... and there can be additional data (metadata), e.g. the number of times the post was viewed, the thumbnail ID, the ID of the user who edited it, etc.

  • Users have basic data from the wp_users table: login, link to the site, email, ... and metadata: biography, link to social profile, admin panel settings.
Post Metadata
User Metadata

Metadata Tables in the WP Database

By default, there are 5 tables in WordPress for different objects (posts, comments, users, taxonomy items, sites):

wp_postmeta
Posts - table wp_postmeta for wp_posts. This is where the usual WordPress "arbitrary post fields" are written.
wp_usermeta
Users - table wp_usermeta for wp_users. Additional data about a user.
wp_termmeta(since WP 4.4)
Terms (taxonomy elements) - table wp_termmeta for wp_terms. Additional data for taxonomy elements.
wp_commentmeta.
Comments - table wp_commentmeta for wp_comments. Metadata for each comment.
wp_sitemeta(for multisite builds)
Sites (main network site in multisite) - table wp_sitemeta for wp_site. Main network site options.
Linking metadata tables to their main tables

All metadata have the same logic and tables of the same structure in the database.

wp_postmeta
wp_termmeta
wp_usermeta
wp_commentmeta

As you can see, the structure of all tables is the same, the only difference is the name of the main fields. They are linked to the main table through the second field (all tables have different names: post_id, user_id, comment_id, term_id). The main key (yellow) is not usually used and is only needed to accurately identify specific rows in the table (this is sometimes necessary). Queries generally work on three fields: 2, 3, 4.

By default, WordPress actively uses metadata tables for posts and for users. Less active is the metadata table for comments - the time of deletion is written there when a comment is moved to trash (if the trash is disabled, the table is not used). And the metadata table for terms is not used at all.

All metadata tables are great for extending core capabilities. And they are actively used by plugin and theme developers.

The metadata table for terms was added in version 4.4 specifically for developers to have a place to store the necessary data for terms.

Hidden (protected) meta-fields

In WordPress there is such a concept as hidden meta-fields. These are fields whose name (meta_key) begins with an underscore _. So it is customary to call the meta-fields that are used for the needs of the code and should not be changed manually.

In the admin, for posts in the block "Custom fields" (when Gutenberg not used) hidden meta fields are not displayed, and therefore they can not be changed. So, for example, when editing a post, the post meta field _edit_lock contains the timestamp and ID of the user who edits the post. Thanks to this we can see when the post is currently being edited by another user. Another example, the ID of an attachment image which is set as the thumbnail of a post is stored in the _thumbnail_id metafield.

There is a special function to find out if a meta-field is hidden: is_protected_meta( $meta_key, $meta_type ).

To make custom meta-field hidden, there is a hook is_protected_meta:

// Hide some meta-fields
add_filter( 'is_protected_meta', 'my_protected_custom_fields', 10, 2 );

function my_protected_custom_fields( $protected, $meta_key ){

	if( in_array( $meta_key, [ 'any_field' ] ) )
		return true;

	return $protected;
}

Metadata functions

Almost all metadata functions are based on four basic functions. These four functions are essentially the basis of the API for working with all metadata in WordPress.

For the posts:

For users:

For comment:

For taxonomies (terms):

You can get the values of all the object's meta-fields using the get_***_meta() function. To do this, you need to specify only the first parameter: the object's id:

$metas = get_post_meta( 76 );
/*
Array(
	[_edit_lock] => Array
			[0] => 1517175359:1

	[_edit_last] => Array
			[0] => 1

	[views] => Array
			[0] => 10164

	[_thumbnail_id] => Array
			[0] => 9556

	[photo] => Array
			[0] => https://example.com/wp-content/uploads/2010/03/Quicktags-API.png
			[1] => https://example.com/wp-content/uploads/2017/07/image.png
)
*/

Cleaning meta-field values when saving

The value of any meta-field can be sanitized through the filter: sanitize_(type)_meta_(meta_key).

This filter is always triggered when adding or updating a meta field value.

All variants of the filter if you specify the first parameter:

  • sanitize_post_meta_(meta_key)
  • sanitize_user_meta_(meta_key)
  • sanitize_comment_meta_(meta_key)
  • sanitize_term_meta_(meta_key)
Example of using the filter

Suppose we have a user my_history meta-field. In this field the user can write some text, but it is not allowed to use HTML tags. To make sure no tags are inserted, it is best to clear the value of the field before saving it to the database:

add_filter( 'sanitize_user_meta_'.'my_history', function( $meta_value ){
	return wp_strip_all_tags( $meta_value );
} );

Registration of meta-fields

Since WordPress version 4.6, it is possible to additionally describe each meta-field. This is done through the register_meta() function.

Registration of a custom-field is needed to use it in different APIs, e.g. for the REST API (full support for the REST API is expected from WP 5.0).

The behavior of register_meta() is somewhat similar to register_post_type() - the data is similarly saved to the global variable $wp_meta_keys. This allows you to retrieve the custom field data at any time when writing code.

Example registration of a meta-field

Let's register a meta-field for a post with access and cleanup functions:

register_meta( 'post', 'bookname', [
	'type'              => 'string',
	'description'       => 'Book title',
	'single'            => true,
	'sanitize_callback' => function( $meta_value, $meta_key, $object_type ){
		return wp_strip_all_tags( $meta_value ); // remove HTML tags
	},
	'auth_callback'     => function( $false, $meta_key, $postID, $user_id, $cap, $caps ){
		// forbid creation and editing of this meta-field for everyone except the admin
		return current_user_can('manage_options');
	},
	'show_in_rest'      => false,
] );

As a result, if you go to the post editing page and try to create a meta-field bookname:

  • If you are logged in as an admin, the meta-field will be created.
  • If you are an editor, author, etc. - you will not be able to create this meta-field.

The parameter auth_callback is responsible for this.

Further, if you specify a string with HTML tags in the value, they will be cut out when updating.

The parameter sanitize_callback is responsible for this.*

Parameters: type, description, show_in_rest in WP 4.6, are only informational and are not used anywhere yet. And since WP 5.0 are used in the REST API. For example, if you specify show_in_rest=true, the metafield can be viewed or edited via REST API.

Other functions related to meta-field registration:

Meta fields for custom DB table

The metafields API allows you to create your own meta-table for any table. Let's look at an example.

Suppose we have a table my_books:

Create a metadata table my_bookmeta for it:

To create a metadata table, you need to run the function create_book_meta_table() (see the function code below). The function code is:

## register_activation_hook( __FILE__, 'create_book_meta_table');
## Function to create a metadata table. It needs to be run once.
## Can be hung on register_activation_hook()
function create_book_meta_table(){
	global $wpdb;

	$collate = '';
	if ( ! empty($wpdb->charset) ) $collate  = "DEFAULT CHARACTER SET $wpdb->charset";
	if ( ! empty($wpdb->collate) ) $collate .= " COLLATE $wpdb->collate";

	/*
	 * Indexes have a maximum size of 767 bytes. Historically, we haven't need to be concerned about that.
	 * As of 4.2, however, we moved to utf8mb4, which uses 4 bytes per character. This means that an index which
	 * used to have room for floor(767/3) = 255 characters, now only has room for floor(767/4) = 191 characters.
	 */
	$max_index_length = 191;

	$main_field = 'book_id'; // the name of the main column, should look like: $meta_type . '_id'
	$table_name = 'my_bookmeta';

	$wpdb->query(
		"CREATE TABLE $table_name (
			meta_id      bigint(20)   unsigned NOT NULL auto_increment,
			$main_field  bigint(20)   unsigned NOT NULL default '0',
			meta_key     varchar(255)                   default NULL,
			meta_value   longtext,
			PRIMARY KEY  (meta_id),
			KEY $main_field ($main_field),
			KEY meta_key (meta_key($max_index_length))
		) $collate;"
	);
}

The name of the main column should look like this: $meta_type . '_id'. This is how it is used in the functions: *_metadata().

Let's register functions for working with book metadata:

function add_book_meta( $id, $meta_key, $meta_value, $unique = false ) {
	return add_metadata( 'book', $id, $meta_key, $meta_value, $unique );
}

function delete_book_meta( $id, $meta_key, $meta_value = '' ) {
	return delete_metadata( 'book', $id, $meta_key, $meta_value );
}

function get_book_meta( $id, $meta_key = '', $single = false ) {
	return get_metadata( 'book', $id, $meta_key, $single );
}

function update_book_meta( $id, $meta_key, $meta_value, $prev_value = '' ){
	return update_metadata( 'book', $id, $meta_key, $meta_value, $prev_value );
}

That's it!

Now we can use functions to manage metadata that will work exactly like metadata in WordPress. This includes metadata caching and all types of hooks.

For example:

// add data to the metadata table
update_book_meta( 12, 'author_name', 'Zircon' );

// get the value of the meta-field
get_book_meta( 12, 'author_name', 1 ); //> Zircon

// get the values of all metfields
get_book_meta( 12 ); //> returns an array
Building a query using metadata

In WordPress, it is very convenient to select or sort the rows of the main table, based on metadata parameters. For example, in WP_query() this is done through parameter meta_query.

You can attach the same functionality to our table using WP_Meta_Query{} class. I'll write, for example, a function for getting books, with the ability to select by metadata:

// set tables in $wpdb
global $wpdb;
$wpdb->books    = "my_books";
$wpdb->bookmeta = "my_bookmeta";

## Example function for retrieving books, with metadata sampling capability
function get_books( $args = array() ){
	global $wpdb;

	$default = [
		'book_id'        => 0,
		'name'           => '',
		'content_search' => '',
		// interpreted meta-parameters
		'meta_key'       => '',
		'meta_value'     => '',
		'meta_value_num' => '',
		'meta_compare'   => '',
		'meta_query'     => array(),
	];

	$args = array_merge( $default, $args );

	$WHERE = array();
	$JOIN = $ORDER_BY = $LIMIT = '';

	if( $args['book_id'] ){
		// 'my_books.' it is necessary because the field has the same name for the main and meta table
		$WHERE[] = $wpdb->prepare('my_books.book_id = %d', $args['book_id'] );
	}

	if( $args['name'] ){
		$WHERE[] = $wpdb->prepare('name = %s', $args['name'] );
	}

	if( $args['content_search'] ){
		$WHERE[] = $wpdb->prepare('content LIKE %s', '%'. $wpdb->esc_like( $args['content_search'] ) .'%' );
	}

	// meta query
	if( $args['meta_query'] || $args['meta_key'] ){
		$metaq = new WP_Meta_Query();
		$metaq->parse_query_vars( $args ); // parse possible meta-parameters from parameters $args

		// first parameter 'book' must be the beginning of $wpdb->bookmeta property without 'meta' suffix.
		// I.e. we specify 'book' with 'meta' and 'bookmeta' property must exist in $wpdb.
		// see. https://wp-kama.com/function/_get_meta_table
		$mq_sql = $metaq->get_sql( 'book', $wpdb->books, 'book_id' );

		$JOIN    = $mq_sql['join'];  // INNER JOIN my_bookmeta ON ( my_books.book_id = my_bookmeta.book_id )
		$WHERE[] = $mq_sql['where']; // AND ( ( my_bookmeta.meta_key = 'author_name' AND my_bookmeta.meta_value = 'Циркон' ) )
	}

	$WHERE = 'WHERE '. implode( ' AND ', $WHERE );

	/*
	To sort by meta-fields you will need $metaq->get_clauses()
	Array(
		[metasort] => Array(
				[key]     => author_name
				[value]   => Циркон
				[compare] => =
				[alias]   => my_bookmeta
				[cast]    => CHAR
			)
	)
	for an example, see in: https://wp-kama.com/function/WP_Query::parse_orderby
	*/
	$ORDER_BY = 'ORDER BY name ASC';

	$fields = '*';

	if( isset( $metaq ) && $metaq->has_or_relation() ){
		$fields = "DISTINCT $fields";
	}

	$res = $wpdb->get_results(
		"SELECT $fields FROM $wpdb->books $JOIN $WHERE $ORDER_BY $LIMIT"
	);

	return $res;
}

Let's check the function, let's make a query:

// book request
$books = get_books([
	'meta_key'   => 'author_name',
	'meta_value' => 'Zircon',
]);

// or like this
$books = get_books([
	'meta_query' =>[
		'metasort' => [
			'key'   => 'author_name',
			'value' => 'Zircon',
		]
	]
]);

print_r( $books );
/*
We get it:
Array(
		[0] => stdClass Object(
			[book_id] => 12
			[name] => The Cherry Orchard
			[content] => Contents of the book ...
			[meta_id] => 2
			[meta_key] => author_name
			[meta_value] => Zircon
		)
)
*/

I repeat, now you can do selection by meta-fields of any complexity, all that allows you to do WP_Meta_Query{}, the same as meta_query in WP_Query.

An example of a more complicated selection:

$args = array(
	'meta_query'   => array(
		'relation' => 'AND',
		array(
			'key'     => 'author_name',
			'value'   => 'алекс',
			'compare' => 'LIKE'
		),
		array(
			'key'     => 'price',
			'value'   => array( 20, 100 ),
			'type'    => 'numeric',
			'compare' => 'BETWEEN'
		)
	)
);
$books = get_books( $args );

Performance and metadata

Queries based on meta_query provide huge opportunities, but it all has a downside - not the best performance, especially if you make complex queries and processed a lot of data.

The weak point in the metadata is the meta_value field of any metadata table (such as the wp_postmeta table). The meta_value does not and cannot have an index, because the field is of LONGTEXT type, so you can store any data in it: numbers, texts of any length, serialized arrays, etc. Also, indexing is not possible because when building a query, such as sorting by meta-field where numbers are stored, the values are converted from strings to numbers by CAST() and only then sorted. This approach would "kill" the index, even if it were there.

Because of the disadvantages described above, it is not always correct to use meta-fields to store your data that will then be selected or sorted, although they are fine in most cases.

Always test queries under real-world conditions, looking at the execution time and how that time varies with the amount of data and query parameters. If queries have become slow, it is time to cache them somehow or connect an external indexing system such as Sphinx or Elasticsearch.

If you expect to store large amounts of data which need to be selected and sorted, you may want to create a special table for such data. By large volumes of data I mean the number of values of one meta-key from 20000 and more. With such amount of data in meta-fields queries will already slow down noticeably and you will not be able to do without caching (which is not always suitable) or other addons.

How enable the Custom Fields metabox in the Gutenberg?