Validating, Sanitizing, and Escaping User Data in WordPress Development

“Never trust data provided by users” is a golden rule in programming. A secure WordPress theme or plugin, like any other web application, needs to do at least the following three things to keep the user data workflow reasonably safe.

  • Before processing user input, validate it to ensure that its format meets our requirements.
  • Before saving user input to the database, sanitize it so that unexpected data is not stored, where it could cause bugs or security vulnerabilities.
  • When reading data back from the database and outputting it to the front end, escape it so that unexpected characters cannot break the page layout or inject markup.

Validating User Data Format Before Processing

Validation ensures that the data submitted by the user matches what we expect. WordPress provides several functions that help with format validation; which one to use depends on the type of data being validated.

For example, suppose our form contains a text field:

<input type="text" id="zipcode" name="zipcode" maxlength="5" />

In the above field, we have already used the “maxlength” attribute to limit the user to inputting a maximum of 5 characters, but it doesn’t limit what type of characters the user can input. The user can input “12345” or “abcde”, but zip codes are usually numeric data. If the user inputs non-numeric characters, it obviously does not meet our requirements.

So when processing the form, we need to check that each submitted field is in the format we expect. In this example, we can use the following code to validate the “zipcode” field:

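// Force the submitted value to an integer; missing or non-numeric input becomes 0.
$safe_zipcode = isset( $_POST['zipcode'] ) ? intval( $_POST['zipcode'] ) : 0;
if ( ! $safe_zipcode ) {
  $safe_zipcode = '';
}

// Enforce the 5-character limit on the server as well.
if ( strlen( $safe_zipcode ) > 5 ) {
  $safe_zipcode = substr( $safe_zipcode, 0, 5 );
}

update_post_meta( $post->ID, 'zipcode', $safe_zipcode );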

The form’s “maxlength” attribute is enforced only by the browser. Not every client enforces it, and users can bypass the browser check entirely and submit longer values. Therefore, even if the front end performs a check, we still need to check the data on the server.

The intval() function forces the submitted value to an integer. Input that does not start with a number, such as “abcde”, is converted to 0, so checking whether the result is 0 tells us whether the input was valid. Note that intval() also discards leading zeros and any characters after the leading digits, so “01234” becomes 1234 and “12abc” becomes 12.
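Where the exact format matters (leading zeros kept, no trailing characters), a pattern check is a common alternative to integer conversion. The following is a minimal plain-PHP sketch with no WordPress dependency. The function name example_validate_zipcode() is hypothetical, and it assumes a valid zip code is exactly five ASCII digits.

```php
<?php
// Hypothetical helper: returns the zip code unchanged if it is exactly
// five ASCII digits, or an empty string otherwise.
function example_validate_zipcode( $raw ) {
    // Reject arrays and other non-scalar input before treating it as a string.
    if ( ! is_scalar( $raw ) ) {
        return '';
    }

    $value = trim( (string) $raw );

    // \A and \z anchor the pattern to the whole string.
    if ( preg_match( '/\A[0-9]{5}\z/', $value ) === 1 ) {
        return $value;
    }

    return '';
}

// For comparison, integer conversion alters or accepts these inputs:
var_dump( intval( '01234' ) ); // int(1234)
var_dump( intval( '12abc' ) ); // int(12)
var_dump( intval( 'abcde' ) ); // int(0)
```

With this helper, “01234” is kept as a five-character string, while “12abc” and “123456” are rejected outright rather than silently trimmed.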

This style of validation is closest to the whitelisting approach in WordPress: only accept the user input that you expect. WordPress provides many helper functions that cover most data types.
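The “only allow what you expect” idea is easiest to see for a field that has a fixed set of valid values. The sketch below is plain PHP. The function name example_validate_choice(), the color values, and the fallback are all hypothetical.

```php
<?php
// Hypothetical helper: accept a value only if it appears in a known list,
// otherwise fall back to a default chosen by the developer.
function example_validate_choice( $raw, array $allowed, $default ) {
    // Strict comparison (third argument) prevents type-juggling matches.
    if ( is_string( $raw ) && in_array( $raw, $allowed, true ) ) {
        return $raw;
    }

    return $default;
}

$allowed_colors = array( 'red', 'green', 'blue' );

echo example_validate_choice( 'green', $allowed_colors, 'red' ), "\n";  // green
echo example_validate_choice( 'purple', $allowed_colors, 'red' ), "\n"; // red
```

Because the list of accepted values is defined in code, anything the user invents is replaced by the default instead of reaching the database.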

Sanitizing User Data Before Saving to the Database

Compared with validation, sanitization is more lenient. When our requirements for the format of user input are less strict, we can call sanitizing functions to clean up the data rather than reject it.

For example, if we have such a field in our form:

<input type="text" id="title" name="title" />

We can use the sanitize_text_field() function to strip unwanted characters from the user input.

$title = sanitize_text_field( $_POST['title'] );
update_post_meta( $post->ID, 'title', $title );

This function silently does a lot of things for us behind the scenes, roughly as follows:

  • Check for invalid UTF-8 (using wp_check_invalid_utf8 function)
  • Convert single < characters to HTML entities
  • Remove all tags
  • Remove line breaks, tabs, and extra whitespace
  • Remove octets (percent-encoded characters such as %0A)
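To make the steps listed above more concrete, here is a rough plain-PHP approximation. It is not WordPress’s implementation of sanitize_text_field(), and the order and details differ from the real function. The name example_sanitize_text() is hypothetical, and the sketch only illustrates what each step does to a string.

```php
<?php
// Rough approximation of the listed steps; NOT WordPress's actual code.
function example_sanitize_text( $raw ) {
    $text = (string) $raw;

    // 1. Invalid UTF-8: preg_match() with the /u modifier returns false on
    //    malformed input, so discard the whole value in that case.
    if ( preg_match( '//u', $text ) !== 1 ) {
        return '';
    }

    // 2. Encode a "<" that is never closed by ">" so it is kept as text.
    $text = preg_replace_callback(
        '/<[^>]*?((?=<)|>|$)/',
        function ( $m ) {
            return strpos( $m[0], '>' ) === false
                ? htmlspecialchars( $m[0], ENT_QUOTES, 'UTF-8' )
                : $m[0];
        },
        $text
    );

    // 3. Remove all tags.
    $text = strip_tags( $text );

    // 4. Remove percent-encoded octets such as %0A or %20.
    $text = preg_replace( '/%[a-f0-9]{2}/i', '', $text );

    // 5. Collapse line breaks, tabs, and runs of spaces, then trim the ends.
    return trim( preg_replace( '/[\r\n\t ]+/', ' ', $text ) );
}

echo example_sanitize_text( "  Hello <b>World</b>\n\tagain  " ), "\n"; // Hello World again
```

In a real theme or plugin, call sanitize_text_field() itself; the sketch is only meant to show why the stored value can differ from what the user typed.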

In addition to sanitize_text_field(), WordPress provides other sanitize_*() functions for specific kinds of data, such as sanitize_email(), sanitize_file_name(), sanitize_key(), and sanitize_title().

Escaping User Data During Output

To prevent problems caused by outputting unsafe data, we need to escape user-provided data at the point of output. WordPress provides escaping functions for the following output contexts.

esc_html() Use this function when the data is output inside an HTML element. It escapes the data so that any HTML tags it contains cannot break the page structure or layout. Usage is as follows:

<h4><?php echo esc_html( $title ); ?></h4>
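To see what escaping for HTML output does to a value, the following plain-PHP sketch uses the core htmlspecialchars() function as a stand-in. It illustrates the idea rather than reproducing esc_html() itself, and the title value is made up.

```php
<?php
// A title as it might come back from the database, containing markup.
$title = '<script>alert("hi")</script> Tom & Jerry';

// Convert the characters that are special in HTML into entities so the
// browser displays them as text instead of interpreting them as markup.
$escaped = htmlspecialchars( $title, ENT_QUOTES, 'UTF-8' );

echo '<h4>' . $escaped . '</h4>', "\n";
// <h4>&lt;script&gt;alert(&quot;hi&quot;)&lt;/script&gt; Tom &amp; Jerry</h4>
```

The stored value is left untouched; only the copy sent to the browser is escaped, which is why escaping belongs at output time.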

esc_url() When we need to output a URL string, this function should be used to escape the data, such as URLs for src and href attributes.

<img src="<?php echo esc_url( $great_user_picture_url ); ?>" />
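URLs need their own treatment because the scheme matters: a javascript: URL placed in href or src is unsafe even when its characters are entity-encoded. The plain-PHP sketch below illustrates that idea only. The function example_safe_url() and its list of allowed schemes are hypothetical, and it is deliberately stricter and far simpler than esc_url() (for instance, it rejects relative URLs).

```php
<?php
// Hypothetical helper: return the URL escaped for an HTML attribute if it has
// an explicitly allowed scheme; otherwise return an empty string.
function example_safe_url( $url, array $allowed_schemes = array( 'http', 'https' ) ) {
    $url = trim( (string) $url );

    // Returns the scheme as a string, null if there is none, or false if the
    // URL is too malformed to parse.
    $scheme = parse_url( $url, PHP_URL_SCHEME );

    if ( ! is_string( $scheme ) || ! in_array( strtolower( $scheme ), $allowed_schemes, true ) ) {
        return '';
    }

    // Encode characters such as & and " for use inside an attribute value.
    return htmlspecialchars( $url, ENT_QUOTES, 'UTF-8' );
}

echo example_safe_url( 'https://example.com/a.png?x=1&y=2' ), "\n";
// https://example.com/a.png?x=1&amp;y=2
var_dump( example_safe_url( 'javascript:alert(1)' ) ); // string(0) ""
```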

esc_js() Escapes text strings that are echoed into inline JavaScript, for example inside an onclick attribute:

<a href="#" onclick="<?php echo esc_js( $custom_js ); ?>">Click me</a>

esc_attr() Use this function when outputting user data as the value of an HTML attribute:

<ul class="<?php echo esc_attr( $stored_class ); ?>">

esc_textarea() Escapes text for use in a textarea element.

<textarea><?php echo esc_textarea( $text ); ?></textarea>

Note: Most WordPress template functions already prepare their data for output, so we don’t need to escape their results again. For example:

<h4><?php the_title(); ?></h4>

How well user data is handled is a basic measure of the quality of a WordPress theme or plugin. Wherever a theme or plugin handles user data, we should validate, sanitize, and escape it. Doing so reduces bugs and security vulnerabilities and leads to more stable themes and plugins.
