One of the most important issues for language extensions is accepting and dealing with data passed via arguments. Most extensions are built to deal with specific input data (or require parameters to perform their specific actions), and function arguments are the only real way to exchange data between the PHP level and the C level. Of course, there's also the possibility of exchanging data using predefined global values (which is also discussed later), but this should be avoided by all means, as it's extremely bad practice.
PHP doesn't make use of any formal function declarations; this is why call syntax is always completely dynamic and never checked for errors. Checking for correct call syntax is left to the user code. For example, it's possible to call a function using only one argument at one time and four arguments the next time - both invocations are syntactically absolutely correct.
Since PHP doesn't have formal function definitions with support for call syntax checking, and since PHP features variable arguments, sometimes you need to find out with how many arguments your function has been called. You can use the ZEND_NUM_ARGS macro in this case. In previous versions of PHP, this macro retrieved the number of arguments with which the function has been called based on the function's hash table entry, ht, which is passed in the INTERNAL_FUNCTION_PARAMETERS list. As ht itself now contains the number of arguments that have been passed to the function, ZEND_NUM_ARGS has been stripped down to a dummy macro (see its definition in zend_API.h). But it's still good practice to use it, to remain compatible with future changes in the call interface. Note: The old PHP equivalent of this macro is ARG_COUNT.
The following code checks for the correct number of arguments:
    if(ZEND_NUM_ARGS() != 2) WRONG_PARAM_COUNT;
This macro prints a default error message and then returns to the caller. Its definition can also be found in zend_API.h and looks like this:
    ZEND_API void wrong_param_count(void);

    #define WRONG_PARAM_COUNT { wrong_param_count(); return; }
New parameter parsing API: This chapter documents the new Zend parameter parsing API introduced by Andrei Zmievski. It was introduced in the development stage between PHP 4.0.6 and 4.1.0.
Parsing parameters is a very common operation and it may get a bit tedious. It would also be nice to have standardized error checking and error messages. Since PHP 4.1.0, there is a way to do just that by using the new parameter parsing API. It greatly simplifies the process of receiving parameters, but it has the drawback that it can't be used for functions that expect a variable number of parameters. Since the vast majority of functions do not fall into that category, this parsing API is recommended as the new standard way.
The prototype for the parameter parsing function looks like this:
    int zend_parse_parameters(int num_args TSRMLS_DC, char *type_spec, ...);
zend_parse_parameters() also performs type conversions whenever possible, so that you always receive the data in the format you asked for. Any type of scalar can be converted to another one, but conversions between complex types (arrays, objects, and resources) and scalar types are not allowed.
If the parameters could be obtained successfully and there were no errors during type conversion, the function returns SUCCESS; otherwise it returns FAILURE. The function outputs informative error messages if the number of received parameters does not match the requested number, or if a type conversion could not be performed.
Here are some sample error messages:
    Warning - ini_get_all() requires at most 1 parameter, 2 given
    Warning - wddx_deserialize() expects parameter 1 to be string, array given
Here is the full list of type specifiers:
l - long
d - double
s - string (with possible null bytes) and its length
b - boolean
r - resource, stored in zval*
a - array, stored in zval*
o - object (of any class), stored in zval*
O - object (of class specified by class entry), stored in zval*
z - the actual zval*
| - indicates that the remaining parameters are optional. The storage variables corresponding to these parameters should be initialized to default values by the extension, since they will not be touched by the parsing function if the parameters are not passed.
/ - the parsing function will call SEPARATE_ZVAL_IF_NOT_REF() on the parameter it follows, to provide a copy of the parameter, unless it's a reference.
! - the parameter it follows can be of specified type or NULL (only applies to a, o, O, r, and z). If NULL value is passed by the user, the storage pointer will be set to NULL.
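To make the role of the | , / and ! characters concrete, here is a minimal standalone sketch that scans a type specifier string and reports how many parameters it requires and how many it allows. It is not part of the Zend API: spec_arity and spec_count_args() are hypothetical names invented for this illustration, and the sketch does not check that ! is only used after a, o, O, r, or z.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper type, not part of the Zend API. */
typedef struct {
    int min_args;   /* parameters listed before '|' */
    int max_args;   /* all parameters in the specifier */
} spec_arity;

/* Scans a type specifier such as "O|d" and fills *out.
 * Returns 0 on success, or -1 if the string contains an unknown
 * character, a second '|', or a modifier that follows no parameter. */
int spec_count_args(const char *spec, spec_arity *out)
{
    int seen_optional = 0;   /* set once '|' has been read */
    int can_modify = 0;      /* set while '/' or '!' would follow a parameter */
    int count = 0;
    int required = 0;
    const char *p;

    for (p = spec; *p != '\0'; p++) {
        if (strchr("ldsbraoOz", *p) != NULL) {
            /* a type letter: one more parameter */
            count++;
            if (!seen_optional) {
                required = count;
            }
            can_modify = 1;
        } else if (*p == '|') {
            if (seen_optional) {
                return -1;
            }
            seen_optional = 1;
            can_modify = 0;
        } else if (*p == '/' || *p == '!') {
            /* modifiers apply to the parameter they follow */
            if (!can_modify) {
                return -1;
            }
        } else {
            return -1;
        }
    }

    out->min_args = required;
    out->max_args = count;
    return 0;
}
```

For the specifier "O|d", this sketch reports one required parameter and at most two, which corresponds to an object followed by an optional double.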
The best way to illustrate the usage of this function is through examples:
    /* Gets a long, a string and its length, and a zval. */
    long l;
    char *s;
    int s_len;
    zval *param;
    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "lsz",
                              &l, &s, &s_len, &param) == FAILURE) {
        return;
    }

    /* Gets an object of class specified by my_ce, and an optional double. */
    zval *obj;
    double d = 0.5;
    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "O|d",
                              &obj, my_ce, &d) == FAILURE) {
        return;
    }

    /* Gets an object or null, and an array.
       If null is passed for object, obj will be set to NULL. */
    zval *obj;
    zval *arr;
    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "o!a",
                              &obj, &arr) == FAILURE) {
        return;
    }

    /* Gets a separated array. */
    zval *arr;
    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "a/", &arr) == FAILURE) {
        return;
    }

    /* Get only the first three parameters (useful for varargs functions). */
    zval *z;
    zend_bool b;
    zval *r;
    if (zend_parse_parameters(3 TSRMLS_CC, "zbr!", &z, &b, &r) == FAILURE) {
        return;
    }
Note that in the last example we pass 3 for the number of received parameters, instead of ZEND_NUM_ARGS(). This lets us receive a fixed minimum number of parameters when our function expects a variable number of them. Of course, if you want to operate on the rest of the parameters, you will have to use zend_get_parameters_array_ex() to obtain them.
The parsing function has an extended version that allows for an additional flags argument that controls its actions.
    int zend_parse_parameters_ex(int flags, int num_args TSRMLS_DC, char *type_spec, ...);
The only flag you can pass currently is ZEND_PARSE_PARAMS_QUIET, which instructs the function to not output any error messages during its operation. This is useful for functions that expect several sets of completely different arguments, but you will have to output your own error messages.
For example, here is how you would get either a set of three longs or a string:
    long l1, l2, l3;
    char *s;
    int s_len;

    if (zend_parse_parameters_ex(ZEND_PARSE_PARAMS_QUIET, ZEND_NUM_ARGS() TSRMLS_CC,
                                 "lll", &l1, &l2, &l3) == SUCCESS) {
        /* manipulate longs */
    } else if (zend_parse_parameters_ex(ZEND_PARSE_PARAMS_QUIET, ZEND_NUM_ARGS() TSRMLS_CC,
                                        "s", &s, &s_len) == SUCCESS) {
        /* manipulate string */
    } else {
        php_error(E_WARNING, "%s() takes either three long values or a string as argument",
                  get_active_function_name(TSRMLS_C));
        return;
    }
With all the above-mentioned ways of receiving function parameters you should have a good handle on this process. For even more examples, look through the source code for extensions that are shipped with PHP - they illustrate every conceivable situation.
Deprecated parameter parsing API: This API is deprecated and superseded by the new ZEND parameter parsing API.
After having checked the number of arguments, you need to get access to the arguments themselves. This is done with the help of zend_get_parameters_ex():
    zval **parameter;

    if(zend_get_parameters_ex(1, &parameter) != SUCCESS)
        WRONG_PARAM_COUNT;
zend_get_parameters_ex() accepts at least two arguments. The first argument is the number of arguments to retrieve (which should match the number of arguments with which the function has been called; this is why it's important to check for correct call syntax). The second argument (and all following arguments) are pointers to pointers to pointers to zvals. (Confusing, isn't it?) All these pointers are required because Zend works internally with **zval; to adjust a local **zval in our function, zend_get_parameters_ex() requires a pointer to it.
The return value of zend_get_parameters_ex() can either be SUCCESS or FAILURE, indicating (unsurprisingly) success or failure of the argument processing. A failure is most likely related to an incorrect number of arguments being specified, in which case you should exit with WRONG_PARAM_COUNT.
To retrieve more than one argument, you can use a similar snippet:
    zval **param1, **param2, **param3, **param4;

    if(zend_get_parameters_ex(4, &param1, &param2, &param3, &param4) != SUCCESS)
        WRONG_PARAM_COUNT;
zend_get_parameters_ex() only checks whether you're trying to retrieve too many parameters. If the function is called with five arguments, but you're only retrieving three of them with zend_get_parameters_ex(), you won't get an error but will get the first three parameters instead. Subsequent calls of zend_get_parameters_ex() won't retrieve the remaining arguments, but will get the same arguments again.
If your function is meant to accept a variable number of arguments, the snippets just described are sometimes suboptimal solutions. You have to create a line calling zend_get_parameters_ex() for every possible number of arguments, which is often unsatisfying.
For this case, you can use the function zend_get_parameters_array_ex(), which accepts the number of parameters to retrieve and an array in which to store them:
    zval **parameter_array[4];
    int argument_count;

    /* get the number of arguments */
    argument_count = ZEND_NUM_ARGS();

    /* see if it satisfies our minimal request (2 arguments) */
    /* and our maximal acceptance (4 arguments) */
    if(argument_count < 2 || argument_count > 4)
        WRONG_PARAM_COUNT;

    /* argument count is correct, now retrieve arguments */
    if(zend_get_parameters_array_ex(argument_count, parameter_array) != SUCCESS)
        WRONG_PARAM_COUNT;
A very clever implementation of this can be found in the code handling PHP's fsockopen() located in ext/standard/fsock.c, as shown in Example 46-6. Don't worry if you don't know all the functions used in this source yet; we'll get to them shortly.
fsockopen() accepts two, three, four, or five parameters. After the obligatory variable declarations, the function checks for the correct range of arguments. Then it uses a fall-through mechanism in a switch() statement to deal with all arguments. The switch() statement starts with the maximum number of arguments being passed (five). After that, it automatically processes the case of four arguments being passed, then three, by omitting the otherwise obligatory break keyword in all stages. After having processed the last case, it exits the switch() statement and does the minimal argument processing needed if the function is invoked with only two arguments.
This multiple-stage type of processing, similar to a stairway, allows convenient processing of a variable number of arguments.
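The stairway idea can be shown without any Zend machinery. The following is a minimal standalone sketch, not the fsockopen() source: the options structure, the fill_options() function, and the default values are all hypothetical, and plain long values stand in for the zval arguments a real extension would receive.

```c
#include <stddef.h>

/* Hypothetical settings filled from 2 to 4 arguments. */
typedef struct {
    long width;    /* required, argument 1 */
    long height;   /* required, argument 2 */
    long depth;    /* optional, argument 3, defaults to 1 */
    long flags;    /* optional, argument 4, defaults to 0 */
} options;

/* Returns 0 on success, -1 if argc is outside the accepted range. */
int fill_options(int argc, const long *argv, options *opts)
{
    /* check for the correct range of arguments first */
    if (argc < 2 || argc > 4) {
        return -1;
    }

    /* defaults for the optional arguments */
    opts->depth = 1;
    opts->flags = 0;

    /* start at the maximum and fall through to the smaller cases */
    switch (argc) {
        case 4:
            opts->flags = argv[3];
            /* fall through */
        case 3:
            opts->depth = argv[2];
            /* fall through */
        default:
            break;
    }

    /* minimal processing needed for every accepted argument count */
    opts->width = argv[0];
    opts->height = argv[1];
    return 0;
}
```

Each case handles exactly one optional argument, so adding a fifth argument would only require one more case at the top of the switch.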
To access arguments, it's necessary for each argument to have a clearly defined type. Again, PHP's extremely dynamic nature introduces some quirks. Because PHP never does any kind of type checking, it's possible for a caller to pass any kind of data to your functions, whether you want it or not. If you expect an integer, for example, the caller might pass an array, and vice versa - PHP simply won't notice.
To work around this, you have to use a set of API functions to force a type conversion on every argument that's being passed (see Table 46-4).
Note: All conversion functions expect a **zval as parameter.
Table 46-4. Argument Conversion Functions
Function | Description |
convert_to_boolean_ex() | Forces conversion to a Boolean type. Boolean values remain untouched. Longs, doubles, and strings containing 0 as well as NULL values will result in Boolean 0 (FALSE). Arrays and objects are converted based on the number of entries or properties, respectively, that they have. Empty arrays and objects are converted to FALSE; otherwise, to TRUE. All other values result in a Boolean 1 (TRUE). |
convert_to_long_ex() | Forces conversion to a long, the default integer type. NULL values, Booleans, resources, and of course longs remain untouched. Doubles are truncated. Strings containing an integer are converted to their corresponding numeric representation, otherwise resulting in 0. Arrays and objects are converted to 0 if empty, 1 otherwise. |
convert_to_double_ex() | Forces conversion to a double, the default floating-point type. NULL values, Booleans, resources, longs, and of course doubles remain untouched. Strings containing a number are converted to their corresponding numeric representation, otherwise resulting in 0.0. Arrays and objects are converted to 0.0 if empty, 1.0 otherwise. |
convert_to_string_ex() | Forces conversion to a string. Strings remain untouched. NULL values are converted to an empty string. Booleans containing TRUE are converted to "1", otherwise resulting in an empty string. Longs and doubles are converted to their corresponding string representation. Arrays are converted to the string "Array" and objects to the string "Object". |
convert_to_array_ex(value) | Forces conversion to an array. Arrays remain untouched. Objects are converted to an array by assigning all their properties to the array table. All property names are used as keys, property contents as values. NULL values are converted to an empty array. All other values are converted to an array that contains the specific source value in the element with the key 0. |
convert_to_object_ex(value) | Forces conversion to an object. Objects remain untouched. NULL values are converted to an empty object. Arrays are converted to objects by introducing their keys as properties into the objects and their values as corresponding property contents in the object. All other types result in an object with the property scalar , having the corresponding source value as content. |
convert_to_null_ex(value) | Forces the type to become a NULL value, meaning empty. |
Note: You can find a demonstration of the behavior in cross_conversion.php on the accompanying CD-ROM. Figure 46-2 shows the output.
Using these functions on your arguments will ensure type safety for all data that's passed to you. If the supplied type doesn't match the required type, PHP forces dummy contents on the resulting value (empty strings, arrays, or objects, 0 for numeric values, FALSE for Booleans) to ensure a defined state.
Following is a quote from the sample module discussed previously, which makes use of the conversion functions:
    zval **parameter;

    if((ZEND_NUM_ARGS() != 1) || (zend_get_parameters_ex(1, &parameter) != SUCCESS))
    {
        WRONG_PARAM_COUNT;
    }

    convert_to_long_ex(parameter);

    RETURN_LONG(Z_LVAL_PP(parameter));
Example 46-7. PHP/Zend zval type definition.
Actually, pval (defined in php.h) is only an alias of zval (defined in zend.h), which in turn refers to _zval_struct. This is a most interesting structure. _zval_struct is the "master" structure, containing the value structure, type, and reference information. The substructure zvalue_value is a union that contains the variable's contents. Depending on the variable's type, you'll have to access different members of this union. For a description of both structures, see Table 46-5, Table 46-6 and Table 46-7.
Table 46-5. Zend zval Structure
Entry | Description |
value | Union containing this variable's contents. See Table 46-6 for a description. |
type | Contains this variable's type. For a list of available types, see Table 46-7. |
is_ref | 0 means that this variable is not a reference; 1 means that this variable is a reference to another variable. |
refcount | The number of references that exist for this variable. For every new reference to the value stored in this variable, this counter is increased by 1. For every lost reference, this counter is decreased by 1. When the reference counter reaches 0, no references exist to this value anymore, which causes automatic freeing of the value. |
Table 46-6. Zend zvalue_value Structure
Entry | Description |
lval | Use this property if the variable is of the type IS_LONG, IS_BOOL, or IS_RESOURCE. |
dval | Use this property if the variable is of the type IS_DOUBLE. |
str | This structure can be used to access variables of the type IS_STRING. The member len contains the string length; the member val points to the string itself. Zend uses C strings; thus, the string data is terminated by a trailing 0x00, which is not counted in len. |
ht | This entry points to the variable's hash table entry if the variable is an array. |
obj | Use this property if the variable is of the type IS_OBJECT. |
Table 46-7. Zend Variable Type Constants
Constant | Description |
IS_NULL | Denotes a NULL (empty) value. |
IS_LONG | A long (integer) value. |
IS_DOUBLE | A double (floating point) value. |
IS_STRING | A string. |
IS_ARRAY | Denotes an array. |
IS_OBJECT | An object. |
IS_BOOL | A Boolean value. |
IS_RESOURCE | A resource (for a discussion of resources, see the appropriate section below). |
IS_CONSTANT | A constant (defined) value. |
To access a long you access zval.value.lval, to access a double you use zval.value.dval, and so on. Because all values are stored in a union, trying to access data with incorrect union members results in meaningless output.
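The rule of checking the type before picking a union member can be sketched in standalone C. The structure below is a simplified model loosely following the fields listed in the tables above, not the real Zend definition; the names my_zval, MY_IS_LONG, MY_IS_DOUBLE, MY_IS_STRING, and my_zval_format() are hypothetical.

```c
#include <stdio.h>

/* Hypothetical type tags, standing in for the IS_* constants. */
enum { MY_IS_LONG, MY_IS_DOUBLE, MY_IS_STRING };

/* Simplified model of a value container: a union plus a type tag. */
typedef struct {
    union {
        long lval;
        double dval;
        struct {
            const char *val;
            int len;
        } str;
    } value;
    unsigned char type;
} my_zval;

/* Formats v into buf, reading only the union member that matches
 * the type tag. Returns the snprintf() result, or -1 for an
 * unknown type. */
int my_zval_format(const my_zval *v, char *buf, size_t size)
{
    switch (v->type) {
        case MY_IS_LONG:
            return snprintf(buf, size, "%ld", v->value.lval);
        case MY_IS_DOUBLE:
            return snprintf(buf, size, "%.2f", v->value.dval);
        case MY_IS_STRING:
            /* use the stored length rather than relying on the terminator */
            return snprintf(buf, size, "%.*s", v->value.str.len, v->value.str.val);
        default:
            return -1;
    }
}
```

Reading dval from a container whose tag says MY_IS_LONG would compile without complaint, which is why the switch on the type tag comes first.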
Accessing arrays and objects is a bit more complicated and is discussed later.
If your function accepts arguments passed by reference that you intend to modify, you need to take some precautions.
What we haven't mentioned yet is that, under the circumstances presented so far, you don't have write access to any zval containers designating function parameters that have been passed to you. Of course, you can change any zval containers that you created within your function, but you mustn't change any zvals that refer to Zend-internal data!
We've only discussed the so-called *_ex() API so far. You may have noticed that the API functions we've used are called zend_get_parameters_ex() instead of zend_get_parameters(), convert_to_long_ex() instead of convert_to_long(), etc. The *_ex() functions form the so-called new "extended" Zend API. They give a minor speed increase over the old API, but as a tradeoff are only meant for providing read-only access.
Because Zend works internally with references, different variables may reference the same value. Write access to a zval container requires this container to contain an isolated value, meaning a value that's not referenced by any other containers. If a zval container were referenced by other containers and you changed the referenced zval, you would automatically change the contents of the other containers referencing this zval (because they'd simply point to the changed value and thus change their own value as well).
zend_get_parameters_ex() doesn't care about this situation, but simply returns a pointer to the desired zval containers, whether they consist of references or not. Its corresponding function in the traditional API, zend_get_parameters(), immediately checks for referenced values. If it finds a reference, it creates a new, isolated zval container; copies the referenced data into this newly allocated space; and then returns a pointer to the new, isolated value.
This action is called zval separation (or pval separation). Because the *_ex() API doesn't perform zval separation, it's considerably faster, while at the same time disabling write access.
To change parameters, however, write access is required. Zend deals with this situation in a special way: Whenever a parameter to a function is passed by reference, it performs automatic zval separation. This means that whenever you're calling a function like this in PHP, Zend will automatically ensure that $parameter is being passed as an isolated value, rendering it to a write-safe state:
    my_function(&$parameter);
But this is not the case with regular parameters! All other parameters that are not passed by reference are in a read-only state.
This requires you to make sure that you're really working with a reference - otherwise you might produce unwanted results. To check for a parameter being passed by reference, you can use the macro PZVAL_IS_REF. This macro accepts a zval* to check if it is a reference or not. Examples are given in Example 46-8.
Example 46-8. Testing for referenced parameter passing.
You might run into a situation in which you need write access to a parameter that's retrieved with zend_get_parameters_ex() but not passed by reference. For this case, you can use the macro SEPARATE_ZVAL, which does a zval separation on the provided container. The newly generated zval is detached from internal data and has only a local scope, meaning that it can be changed or destroyed without implying global changes in the script context:
    zval **parameter;

    /* retrieve parameter */
    zend_get_parameters_ex(1, &parameter);

    /* at this stage, <parameter> still is connected */
    /* to Zend's internal data buffers */

    /* make <parameter> write-safe */
    SEPARATE_ZVAL(parameter);

    /* now we can safely modify <parameter> */
    /* without implying global changes */
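The effect of separation can be modeled in a few lines of standalone C. This is only a sketch of the idea described in this section, not the Zend macro itself: rc_value, rc_new(), and rc_separate() are hypothetical names, the container holds just a long, and the sketch leaves reference containers alone, as the / specifier is described as doing.

```c
#include <stdlib.h>

/* Hypothetical reference-counted container holding a long. */
typedef struct {
    long lval;
    int refcount;   /* number of holders of this container */
    int is_ref;     /* 1 if the container is a PHP-level reference */
} rc_value;

/* Allocates a container with a single holder; NULL on failure. */
rc_value *rc_new(long lval)
{
    rc_value *v = malloc(sizeof *v);
    if (v != NULL) {
        v->lval = lval;
        v->refcount = 1;
        v->is_ref = 0;
    }
    return v;
}

/* If *pp is shared and is not a reference, replaces *pp with a
 * private copy and drops one count from the original.
 * Returns 0 on success, -1 if the copy could not be allocated. */
int rc_separate(rc_value **pp)
{
    rc_value *orig = *pp;
    rc_value *copy;

    if (orig->is_ref || orig->refcount <= 1) {
        return 0;   /* nothing to do: writing is already safe */
    }

    copy = rc_new(orig->lval);
    if (copy == NULL) {
        return -1;
    }
    orig->refcount--;
    *pp = copy;
    return 0;
}
```

After rc_separate() succeeds on a shared container, a write through the separated pointer no longer affects the other holders, which is the write-safe state the text refers to.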
Note: As you can easily work around the lack of write access in the "traditional" API (with zend_get_parameters() and so on), this API seems to be obsolete, and is not discussed further in this chapter.