
PHP Extensions: Understanding and working with hash API Part 1

10 May

A HashTable is a collection that maps keys to values. In the Zend Engine (PHP 5) it is implemented as a hash table whose elements are also chained in a doubly linked list, so insertion order is preserved. Because it is used so heavily in the Zend Engine and the PHP core, an entire subset of the API is devoted to it.

The HashTable is important because

  1. All userspace variables are stored in hash tables.
  2. A hash table can store any piece of data of any size.

The general syntax for initializing a hash table is

zend_hash_init(HashTable *ht, uint nSize, hash_func_t pHashFunction, dtor_func_t pDestructor, zend_bool persistent);

Here

ht is a pointer to the HashTable variable.

nSize is the number of elements the HashTable is expected to hold. It is only a hint, not a limit: the table grows as needed, and the engine rounds the value up to a power of 2.

pHashFunction is no longer used in newer versions, so it should always be NULL.

pDestructor is a pointer to a function that is called whenever an element is removed from the table, for example by zend_hash_del, or when zend_hash_update replaces an existing element.

The prototype of the destructor must be

void method_name(void *pElement)

where pElement is a pointer to the element being removed.

The final option, persistent, is a flag that the Zend Engine passes on to pemalloc(). A non-zero value requests persistent memory that outlives the current request; 0 requests per-request memory.

An example of initializing a hash table is

zend_hash_init(&EG(symbol_table), 50, NULL, ZVAL_PTR_DTOR, 0);

Populating the hash table

The following functions are used to populate a HashTable.

  1. int zend_hash_add(HashTable *ht, char *arKey, uint nKeySize, void *pData, uint nDataSize, void **pDest)
  2. int zend_hash_update(HashTable *ht, char *arKey, uint nKeySize, void *pData, uint nDataSize, void **pDest)
  3. int zend_hash_index_update(HashTable *ht, ulong h, void *pData, uint nDataSize, void **pDest)
  4. int zend_hash_next_index_insert(HashTable *ht, void *pData, uint nDataSize, void **pDest)

Let's have a look at an example.

If you want to represent $foo["bar"] = "baz"; write

zval *barValue;

MAKE_STD_ZVAL(barValue);

Z_TYPE_P(barValue) = IS_STRING;

Z_STRVAL_P(barValue) = estrdup("baz");

Z_STRLEN_P(barValue) = sizeof("baz") - 1;

zend_hash_add(fooHashTable, "bar", sizeof("bar"), &barValue, sizeof(zval *), NULL);

Note that the string length excludes the terminating NUL byte, while the key length passed to zend_hash_add includes it.

Sometimes you need to know the next free numeric index before inserting into or updating a hash table. You can get it by writing

ulong nextid = zend_hash_next_free_element(ht);

and then perform the update as

zend_hash_index_update(ht, nextid, &data, sizeof(data), NULL);

Finding data in HashTable

The paragraphs above discussed how to initialize and populate a HashTable. This section shows how to find the value for a particular key or index.

Two functions are used.

The first finds the value for a particular key of an associative array. Its syntax is

int zend_hash_find(HashTable *ht, char *key, uint nKeySize, void **pData);

The other finds the value at a given index of an indexed array.

Its general syntax is

int zend_hash_index_find(HashTable *ht, ulong index, void **pData);

Another important feature of the hash API is checking whether a given key or index exists. Two functions are used for this purpose.

zend_hash_exists(HashTable *ht, char *key, uint nKeySize);

This function checks for a key in an associative array,

while

zend_hash_index_exists(HashTable *ht, ulong index);

checks for an index in an indexed array.

A rough userspace equivalent for checking a key is

isset($foo);

To achieve this using the hash API, use

zend_hash_exists(EG(active_symbol_table), "foo", sizeof("foo"));

This function returns 1 if the key exists and 0 otherwise.

Iterating through HashTable

The hash API provides some useful functions for iterating through a hash table.

  1. zend_hash_apply(HashTable *ht, apply_func_t apply_func TSRMLS_DC);
  2. zend_hash_apply_with_argument(HashTable *ht, apply_func_arg_t apply_func, void *data TSRMLS_DC);
  3. zend_hash_apply_with_arguments(HashTable *ht, apply_func_args_t apply_func, int numargs, ...)

Each apply function returns one of the following

  1. ZEND_HASH_APPLY_KEEP
  2. ZEND_HASH_APPLY_STOP
  3. ZEND_HASH_APPLY_REMOVE
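
The effect of the three return values can be sketched in plain C without the Zend headers. The snippet below is only a toy model, not the engine's implementation: the toy_ names, the constant values, and the use of an int array instead of a HashTable are all assumptions made for illustration. It shows a callback being applied to each element, with "remove" dropping the element and "stop" ending the walk while keeping the rest.

```c
#include <stddef.h>

/* Toy stand-ins for the apply return codes (hypothetical names and values). */
enum { TOY_APPLY_KEEP = 0, TOY_APPLY_REMOVE = 1, TOY_APPLY_STOP = 2 };

/* Hypothetical callback type: inspects one element, returns a TOY_APPLY_* code. */
typedef int (*toy_apply_func_t)(int *element);

/* Calls fn on each of items[0..*count) in order and compacts the array in
 * place. After a STOP result, the remaining elements are kept untouched. */
void toy_apply(int *items, size_t *count, toy_apply_func_t fn)
{
    size_t read = 0, write = 0;
    int stopped = 0;

    while (read < *count) {
        int result = stopped ? TOY_APPLY_KEEP : fn(&items[read]);

        if (!(result & TOY_APPLY_REMOVE)) {
            items[write++] = items[read];   /* element survives */
        }
        if (result & TOY_APPLY_STOP) {
            stopped = 1;                    /* no more callbacks */
        }
        read++;
    }
    *count = write;
}

/* Example callback: drop negative numbers, stop at the first zero. */
int drop_negatives(int *element)
{
    if (*element < 0) {
        return TOY_APPLY_REMOVE;
    }
    if (*element == 0) {
        return TOY_APPLY_STOP;
    }
    return TOY_APPLY_KEEP;
}
```

Applied to {3, -1, 4, 0, -5}, this toy walk removes -1, stops at 0, and leaves -5 in place because the callback is never invoked for it.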

Besides these functions, there are some other useful functions for working with HashTables. They take a precomputed hash value, which saves recomputing the hash when the same key is used repeatedly.

Try these functions as well

  1. zend_get_hash_value(char *arKey, uint nKeyLen);
  2. zend_hash_quick_add(HashTable *ht, char *arKey, uint nKeyLen, ulong hashval, void *pData, uint nDataSize, void **pDest);
  3. zend_hash_quick_update(HashTable *ht, char *arKey, uint nKeyLen, ulong hashval, void *pData, uint nDataSize, void **pDest);
  4. zend_hash_quick_find(HashTable *ht, char *arKey, uint nKeyLen, ulong hashval, void **pDest);
  5. zend_hash_quick_exists(HashTable *ht, char *arKey, uint nKeyLen, ulong hashval);

Destruction

There are four functions to keep in mind when removing elements from a HashTable.

  1. zend_hash_del(HashTable *ht, char *arKey, uint nKeySize);
  2. zend_hash_index_del(HashTable *ht, ulong h)
  3. zend_hash_clean(HashTable *ht)
  4. zend_hash_destroy(HashTable *ht)

The first two functions remove a single element from the HashTable.

zend_hash_clean iterates through the elements and removes every one of them, leaving an empty table that can still be used.

zend_hash_destroy does everything zend_hash_clean does and also frees the internal structures allocated during zend_hash_init.

When you allocate a hash table yourself, you may need the following as well

HashTable *ht;

// allocate memory

ALLOC_HASHTABLE(ht);

// initialize its internal state

zend_hash_init(ht, 50, NULL, ZVAL_PTR_DTOR, 0);

// destroy the hash table

zend_hash_destroy(ht);

// free the hash table itself

FREE_HASHTABLE(ht);

PHP Extension: Working with arrays

10 May

What if you are unable to work with arrays while building PHP extensions?

You would be missing half of the picture, I guess.

Before discussing how to create and handle arrays when building PHP extensions, I will discuss what arrays are and how they are used in C.

An array is a contiguous block of memory.

It is sometimes said that an array is a pointer, which is not true. An array can often be used like a pointer, and a pointer like an array, but they are different things.

To declare an array in C, simply write

int num[5];

num now represents a block of memory holding 5 integers. On platforms where an int is 4 bytes, which is the common case, it takes 20 bytes.

To assign a value to the first element of the array, write

*num = 10;

This works, but it is more usual to write

num[0] = 10;

The two statements are equivalent; one uses pointer notation and the other uses an array index.

To assign a value to the second element, write

*(num+1) = 20;

or

num[1] = 20;

Let's take the discussion a bit further.

To get the address of the array, we can write

int *p;

p = num;

First we declare an integer pointer and then assign the array to it. The pointer p now holds the memory address of the first element of the array num.

This can also be achieved by writing

p = &num[0];

Again, p holds the address of the first element of the array.

Earlier I said that arrays and pointers are different even though each can act like the other. Consider the following example:

double num[10];

double *dp = num;

If you want to access the fifth element of the array (index 4), write

dp[4];

This is the same as

num[4];

With a pointer, however, you can also do this

dp++;

which moves the pointer to the next element. However

num++;

cannot be done, because num is an array, not a pointer. An array name cannot be modified, so this does not compile.
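
The difference can be sketched in a few lines of standalone C. The function names here are hypothetical and exist only for illustration: one walks an array through a pointer that is incremented, the other uses sizeof on a real array, which still knows its full size.

```c
#include <stddef.h>

/* Hypothetical helper: sums `count` doubles by advancing a pointer. */
double sum_with_pointer(const double *values, size_t count)
{
    double total = 0.0;
    const double *cursor = values;       /* a pointer can be re-aimed */
    const double *end = values + count;  /* one past the last element */

    while (cursor != end) {
        total += *cursor;
        cursor++;                        /* legal: cursor is a pointer variable */
    }
    return total;
}

/* Hypothetical helper: element count of a real array, computed with sizeof. */
size_t demo_array_length(void)
{
    double num[10];

    /* Here sizeof sees the whole array (10 doubles), not a pointer. */
    return sizeof(num) / sizeof(num[0]);
}
```

Calling sum_with_pointer on an array works because the array name converts to a pointer to its first element at the call site; the array itself is never incremented.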

When it comes to passing an array as an argument to a function, C offers the following notations.

long n[100];

f(n);

void f(long *lp)

{

}

Here the function is declared as taking the address of the array through a pointer. Changing the values at the pointed-to location modifies the original array.

void f(long l[100])

{

}

Here the parameter is written with array syntax. In C this is only notation: the compiler treats l as a long * exactly as in the first version, so the array is not copied and changing l[i] still modifies the original array.

The size in the brackets is ignored, so you can just as well leave it empty, like

void f(long l[])

{

}

The compiler does not keep track of the size of the array inside the function. If the function needs the length, pass it as a separate argument.
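
A minimal sketch of this, with hypothetical function names, shows that both parameter spellings write into the caller's array and that the length has to travel as its own argument:

```c
#include <stddef.h>

/* Pointer spelling: lp points at the caller's first element. */
void fill_via_pointer(long *lp, size_t count, long value)
{
    for (size_t i = 0; i < count; i++) {
        lp[i] = value;
    }
}

/* Array spelling: the compiler adjusts `long l[]` to `long *l`,
 * so this also writes into the caller's array, not into a copy. */
void fill_via_array_syntax(long l[], size_t count, long value)
{
    for (size_t i = 0; i < count; i++) {
        l[i] = value;
    }
}
```

After calling fill_via_array_syntax(n, 4, 7) on a local long n[4], every element of n in the caller is 7.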

Now let's look at how to use arrays while building extensions.

Arrays can easily be passed as parameters and returned from functions.

Look at the following code, which builds and returns an array.

PHP_FUNCTION(return_array)

{

zval *subarr;

array_init(return_value);

add_next_index_null(return_value);

add_next_index_long(return_value, 42);

add_next_index_string(return_value, "hello", 1);

add_next_index_double(return_value, 3.1415);

add_assoc_string(return_value, "name", "faheem", 1);

add_index_long(return_value, 100, 33);

// add a sub array to the array

MAKE_STD_ZVAL(subarr);

array_init(subarr);

add_next_index_long(subarr, 45);

add_next_index_string(subarr, "sub array", 1);

add_next_index_zval(return_value, subarr);

}

While working with an array, you first need to initialize it as

array_init(array_name);

Once the array is initialized, you can add values to it. Several families of functions are used for this purpose.

add_next_index_*() adds a value at the next available index.

add_index_*() adds a value at a given index.

add_assoc_*() adds a value under a string key. This is used for creating associative arrays.

PHP Extensions: understanding preprocessors

10 May

In my first example, where I discussed building a simple extension, you can see lots of #define statements. These are preprocessor macros. So before going into the details of building extensions, it is worth understanding them.

C's preprocessor is a text-substitution phase of the compilation process. It reads the directives and uses them to modify the source code. Since all of this happens before the compiler proper is involved, you can use almost anything, even C keywords, as a macro name. The one name that cannot be used as a macro is defined.

Macros can be either object-like or function-like.

Object-like macros can be defined as

#define NULL 0

or

#define NULL ((void *) 0)

This type is mostly used for constants that you may want to change later in one place. E.g.

#define HEIGHT 67

#define WIDTH 70

They can also be used to define expressions like

#define HALFHEIGHT (HEIGHT/2)

Here you can see that we have defined a macro using an already defined macro.

Function-like macros take parentheses. An example is

#define max(a,b) (a>b?a:b)

If you later write max(6,4) in the code, it will be replaced with (6>4?6:4).

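The preprocessor even accepts max(,7), which passes an empty first argument, although the resulting expansion is not valid C.

However, if an argument is itself an expression, the result may not be what you expect. max(k+1,4) happens to work, but max(k&7,4) expands to (k&7>4?k&7:4), and because > binds more tightly than &, the condition is read as k&(7>4). To avoid this, put extra parentheses around each parameter, like the following.

#define max(a,b) ((a)>(b)?(a):(b))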
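
The precedence problem is easy to reproduce in a small standalone sketch. The macro and function names below are hypothetical and chosen so both variants can live side by side; the argument k & 7 is used because & binds less tightly than >. A compiler may warn about the unparenthesized expansion, which is rather the point.

```c
/* Unparenthesized parameters: arguments are pasted in as-is. */
#define MAX_UNSAFE(a, b) (a > b ? a : b)

/* Every parameter wrapped in parentheses. */
#define MAX_SAFE(a, b) ((a) > (b) ? (a) : (b))

/* Expands to (k & 7 > 4 ? k & 7 : 4); the condition parses as k & (7 > 4). */
int unsafe_result(int k)
{
    return MAX_UNSAFE(k & 7, 4);
}

/* Expands to ((k & 7) > (4) ? (k & 7) : (4)), the intended comparison. */
int safe_result(int k)
{
    return MAX_SAFE(k & 7, 4);
}
```

With k = 6, the safe version yields 6, while the unsafe version evaluates 6 & 1, gets 0, and falls through to 4.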

Keep in mind that using strings in macros is a bit tricky. Take a look at the following example.

#define plot(me) me "me"

And later you write

plot(zend)

This will be replaced with

zend "me"

You can see that the first me is substituted while the one inside the quotes is not.

To turn a macro argument into a quoted string, write

#define p(string) printf("%s", #string)

Now when you use

p(hello)

it will be replaced with

printf("%s", "hello");

Because of the # before string, the argument is placed in the code as a string literal when substituted.

Keep in mind that anything enclosed in double quotes is not substituted.

Another important point is that if you define

#define mo(cat) "Error:" #cat

and later write

printf("%s", mo(no such cat));

it will be replaced with

printf("%s", "Error:" "no such cat");

and the compiler joins the two adjacent string literals into "Error:no such cat".

If you want to paste two tokens together into one, write

#define concat(a,b) a##b

concat(2,1) will yield 21.
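
Both operators can be tried out in a short standalone sketch. The macro and function names are hypothetical, picked to avoid clashing with anything real; the behavior of # and ## is standard C.

```c
#include <string.h>

#define STRINGIFY(x) #x              /* # turns the argument into a string literal */
#define CONCAT(a, b) a##b            /* ## pastes two tokens into one */
#define ERROR_MSG(cat) "Error:" #cat /* adjacent literals are joined by the compiler */

const char *stringified(void)
{
    return STRINGIFY(hello world);   /* becomes "hello world" */
}

int pasted_number(void)
{
    return CONCAT(2, 1);             /* becomes the single token 21 */
}

const char *error_message(void)
{
    return ERROR_MSG(no such cat);   /* becomes "Error:no such cat" */
}

/* Stringifying an argument that already has quotes keeps the quotes:
 * STRINGIFY("hello") is the 7-character string "hello" with its quotes. */
size_t quoted_length(void)
{
    return strlen(STRINGIFY("hello"));
}
```

The last function is why the argument to a stringifying macro is normally written without quotes.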

The first extension-building example does define macros, but most of them were defined conditionally.

So let's have a look at how to define macros based on a condition.

The general syntax is

#if expression

#endif

If the expression evaluates to zero, the code between #if and #endif is not included; otherwise it is included.

A basic example is

#if MAX > 80

#define MIN 100

#endif

Another example is

#define MAX

#if defined MAX

And similarly

#if !defined MAX

A shorter form can also be used

#ifdef MAX

And

#ifndef MAX

Another very good example would be usage of if-elif like this

#if expression1

….

#elif expression2

….

#elif expression3

#else

….

#endif

This example states that if the first expression is true, what follows expression1 is included. If not, the second expression is checked, and so on through all the expressions. If none of them is true, what follows the #else part is included.
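
As a concrete sketch of such a chain, the snippet below picks a string at compile time based on a numeric setting. BUFFER_KB, BUFFER_CLASS, and buffer_class are hypothetical names invented for this example; in practice the setting would usually come from a -D compiler flag.

```c
/* Hypothetical setting; use the default unless the build supplies one. */
#ifndef BUFFER_KB
#define BUFFER_KB 64
#endif

/* Exactly one of the three definitions survives preprocessing. */
#if BUFFER_KB > 128
#define BUFFER_CLASS "large"
#elif BUFFER_KB > 32
#define BUFFER_CLASS "medium"
#else
#define BUFFER_CLASS "small"
#endif

const char *buffer_class(void)
{
    return BUFFER_CLASS;
}
```

With the default of 64, the first condition fails, the #elif succeeds, and buffer_class() returns "medium".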

What if you have defined a macro and you want to remove it?

This can be achieved by using the

#undef

directive,

like

#define MAX 200

#undef MAX

The C preprocessor also provides predefined macros like

__FILE__

__LINE__

__DATE__

__TIME__

__STDC__

__STDC_HOSTED__

__STDC_VERSION__

and, since C99, the predefined identifier

__func__
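
A common use of these is building diagnostic messages. The sketch below assumes a hypothetical helper name, describe_location, and formats the current file, line, and function into a caller-supplied buffer.

```c
#include <stdio.h>

/* Hypothetical helper: writes "file:line (function)" into buf and returns
 * the number of characters snprintf wanted to write. */
int describe_location(char *buf, size_t size)
{
    /* __FILE__ and __LINE__ are expanded by the preprocessor;
     * __func__ is supplied by the compiler as the enclosing function's name. */
    return snprintf(buf, size, "%s:%d (%s)", __FILE__, __LINE__, __func__);
}
```

The exact text depends on the file name and line where the helper is compiled, but it always ends with "(describe_location)".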

PHP Extensions: Understanding memory allocation

10 May

One of the nice features of C is that it lets you control how memory is allocated for the data you use. Wasting memory is not good practice when handling several thousand or even millions of requests.

So before building PHP extensions it is better to understand how C handles memory allocation.

Malloc

In order to use this function you need to include stdlib.h in your program.

This function returns either the address of the allocated memory or NULL if no memory could be allocated.

An example is

#include <stdlib.h>

int main(void)

{

char *p;

p = malloc(500);

}

This example allocates 500 bytes of memory and stores the address in the pointer p. malloc returns a void *, so the result can be assigned to any object pointer type. The one case you must handle is failure, when malloc returns NULL.

Calloc

Another very useful function for allocating memory. It allocates memory like malloc, but it takes the element count and the element size separately and sets the allocated memory to zero.

An example can be

char *p;

p = calloc(40, sizeof(char));

It takes two arguments.

  1. The number of element
  2. size of each element
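
The zero-filling behavior can be checked with a small sketch. The function name is hypothetical; calloc and free are the standard library calls.

```c
#include <stdlib.h>

/* Hypothetical check: returns 1 if every byte of a fresh calloc block is
 * zero, 0 if any byte is not, and -1 if the allocation itself failed. */
int calloc_is_zeroed(size_t count)
{
    unsigned char *p = calloc(count, sizeof *p);
    int all_zero = 1;
    size_t i;

    if (p == NULL) {
        return -1;                   /* always check for failure */
    }
    for (i = 0; i < count; i++) {
        if (p[i] != 0) {
            all_zero = 0;
        }
    }
    free(p);
    return all_zero;
}
```

The same loop over a malloc block would read indeterminate values, which is the practical difference between the two functions.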

Realloc

Another very useful function.

An example is

char *p;

p = realloc(NULL, 500);

realloc is used to reallocate memory. If NULL is passed as the first argument, realloc works like malloc.

char *pt = malloc(100);

pt = realloc(pt, 500);

This, however, works differently. realloc either resizes the previously allocated block in place or allocates a new block; when a new block is allocated, the contents of the old block are copied to the new location and the old block is freed. If realloc fails it returns NULL and leaves the old block untouched, so in real code assign the result to a temporary pointer first. A question comes up: why would it assign a new memory location?

Well, it depends. When you ask for more memory than the block previously held and there is not enough free space at that location, a new block is allocated elsewhere.

char *pt = malloc(100);

pt = realloc(pt, 0);

If zero is given, many implementations free the previously allocated memory and return a null pointer, but the C standard does not guarantee this behavior, so prefer calling free directly.

When working with memory allocation functions, it is your job to free the allocated memory.

You can use the free function. If you don't free memory, you get a bug called a memory leak.

char *pt;

pt = malloc(500);

free(pt);
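
Putting malloc, realloc, and free together, a minimal sketch of a safe resize might look like the following. The helper names grow_buffer and contents_survive_resize are hypothetical; the pattern of assigning the result of realloc to a temporary first is the part worth copying.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: resizes *buf to new_size bytes.
 * Returns 0 on success; on failure returns -1 and leaves *buf valid. */
int grow_buffer(char **buf, size_t new_size)
{
    char *tmp = realloc(*buf, new_size);

    if (tmp == NULL) {
        return -1;          /* old block is still allocated and unchanged */
    }
    *buf = tmp;             /* block may have moved; old pointer is stale */
    return 0;
}

/* Returns 1 if data written before the resize is still there afterwards,
 * 0 if it is not, and -1 if an allocation failed. */
int contents_survive_resize(void)
{
    char *pt = malloc(8);
    int ok;

    if (pt == NULL) {
        return -1;
    }
    strcpy(pt, "zend");
    if (grow_buffer(&pt, 5000) != 0) {
        free(pt);
        return -1;
    }
    ok = (strcmp(pt, "zend") == 0);
    free(pt);               /* every successful allocation is freed once */
    return ok;
}
```

Writing pt = realloc(pt, n) directly would lose the only pointer to the old block if realloc failed, which is itself a memory leak.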

PHP Extensions: Accepting parameters.

10 May

While building an extension, you will be writing functions. They may be standalone functions or methods defined in classes. Whatever the case, most functions accept parameters. In C we can pass parameters as

void sum(int a, int b)

{

printf("a + b = %d", a+b);

}

The function above accepts two parameters, a and b, and prints the sum of their values.

In a PHP extension, the same function looks like this.

PHP_FUNCTION(sum)

{

long a, b;

if(zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "ll", &a, &b) == FAILURE)

{

RETURN_NULL();

}

php_printf("a + b = %ld", a + b);

}

Here zend_parse_parameters() takes care of the values passed in. If the values cannot be parsed, the function returns null; otherwise it prints the sum.

However, accepting a string as a parameter takes a bit more care than accepting integer and float values.

A function accepting a string as a parameter looks like

PHP_FUNCTION(handle_string)

{

char *name;

int name_len;

if(zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s", &name, &name_len) == FAILURE)

{

RETURN_NULL();

}

PHPWRITE(name, name_len);

}

Functions can accept optional parameters. C itself has no default arguments, but in a language such as C++ you could write

void printme(char name[], int len = 9)

{

}

When calling this function, you must pass an array of characters. The second parameter, len, is optional.

If len is not provided, it is set to 9.

To define a function accepting optional parameters while building an extension, you need to write the following code.

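PHP_FUNCTION(printme)

{

zval *arr;

long num = 9;

if(zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "a|l", &arr, &num) == FAILURE)

{

RETURN_NULL();

}

// code goes here

}

Here "a|l" specifies that the first argument is compulsory while the second is optional: everything after the | may be omitted. zend_parse_parameters leaves num untouched when the argument is omitted, which is why it is given its default value up front.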

PHP Extensions: returning values

10 May

When it comes to functions and methods, it's important that they return values.

A function has a return statement, which is executed last. Code placed after the return statement never executes.

While building an extension, you cannot simply use a C return statement to hand back a value, because a PHP_FUNCTION is declared as returning void. Instead, Zend provides return_value, a zval pointer that is available whenever a function is called.

In C, if you want to return 42, it is as simple as this

return 42;

Zend, however, provides different approaches to achieve this.

One is to build a zval and copy it into return_value

PHP_FUNCTION(return_value)

{

zval *val;

MAKE_STD_ZVAL(val);

ZVAL_LONG(val, 42);

RETURN_ZVAL(val, 1, 1);

}

A simpler approach is

PHP_FUNCTION(return_value)

{

ZVAL_LONG(return_value, 42);

return;

}

Or

PHP_FUNCTION(return_value)

{

Z_TYPE_P(return_value) = IS_LONG;

Z_LVAL_P(return_value) = 42;

return;

}

Or

PHP_FUNCTION(return_value)

{

return_value->type = IS_LONG;

return_value->value.lval = 42;

return;

}

However, accessing the structure members directly is not a good approach. The preferred approach is

PHP_FUNCTION(return_value)

{

RETVAL_LONG(42);

return;

}

PHP Extensions: Introduction to data types

10 May

zval is the fundamental unit of data storage in Zend. It's a small struct with four members, defined in Zend/zend.h

The format of the structure is

typedef struct _zval_struct {

zvalue_value value;

zend_uint refcount;

zend_uchar type;

zend_uchar is_ref;

}zval;

Here refcount is an unsigned int.

type and is_ref are both unsigned char.

value, however, is a union. It has the format

typedef union _zvalue_value {

long lval;

double dval;

struct {

char *val;

int len;

} str;

HashTable *ht;

zend_object_value obj;

}zvalue_value;

Zend currently has eight data types.

  1. IS_BOOL: contains true or false.
  2. IS_LONG: used for integer data.
  3. IS_DOUBLE: for floating point numbers.
  4. IS_STRING: for character data.
  5. IS_ARRAY: for collections of values. It can contain a complex set of data buckets.
  6. IS_OBJECT: takes arrays one step further by adding access modifiers, methods, and special events.
  7. IS_RESOURCE: used for values or pointers that cannot be represented by the scalar or array types.

The eighth is

IS_NULL: the type of a variable that holds no value.
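
The combination of a union with a separate type field is a general C pattern often called a tagged union. The standalone sketch below imitates it with hypothetical toy_ names; it is not the Zend definition and omits refcounting, arrays, objects, and resources. It only shows why the type must be checked before a union member is read.

```c
#include <stddef.h>

/* Hypothetical type tags, loosely mirroring the IS_* constants. */
enum toy_type { TOY_NULL, TOY_LONG, TOY_DOUBLE, TOY_STRING };

/* Toy value: one storage area shared by all types, plus a tag. */
typedef struct {
    union {
        long lval;
        double dval;
        struct {
            const char *val;
            int len;
        } str;
    } value;
    enum toy_type type;
} toy_zval;

toy_zval toy_long(long n)
{
    toy_zval z;
    z.type = TOY_LONG;
    z.value.lval = n;
    return z;
}

toy_zval toy_string(const char *s, int len)
{
    toy_zval z;
    z.type = TOY_STRING;
    z.value.str.val = s;    /* borrowed pointer; the toy does not copy */
    z.value.str.len = len;
    return z;
}

/* The tag decides which union member is meaningful. */
const char *toy_type_name(const toy_zval *z)
{
    switch (z->type) {
    case TOY_LONG:   return "long";
    case TOY_DOUBLE: return "double";
    case TOY_STRING: return "string";
    default:         return "null";
    }
}
```

Reading value.lval from a value created with toy_string would reinterpret the bytes of a pointer, which is exactly the mistake the type check prevents.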

Data values

Since every variable in Zend is represented by a zval, we can inspect a zval through several macros.

If we have

zval val;

we can check its type as

Z_TYPE(val);

An example could be

if (Z_TYPE(val) == IS_BOOL) {

// bool value.

} else if(Z_TYPE(val) == IS_LONG) {

// integer value

}

If we have a pointer to a zval, the macro Z_TYPE changes to Z_TYPE_P.

Consider the following example

zval *val;

Z_TYPE_P(val);

And the example above becomes

if(Z_TYPE_P(val) == IS_BOOL) {

// bool value.

} else if(Z_TYPE_P(val) == IS_LONG) {

// integer value

}

The scenario changes again when we have a pointer to a pointer,

like

zval **val;

Z_TYPE_PP(val);

And the example above becomes

if(Z_TYPE_PP(val) == IS_BOOL) {

// bool value.

} else if(Z_TYPE_PP(val) == IS_LONG) {

// integer value

}

Strings are handled differently. For example, if you want to print an integer value in plain C, write

printf("%ld", Z_LVAL_P(val));

For extension development it is better to use the function php_printf(), which writes to PHP's output.

You can use it like

php_printf("%ld", Z_LVAL_P(val));

However, from the union above we can see that a string has two members, val and len.

To print a string, use

PHPWRITE(Z_STRVAL_P(val), Z_STRLEN_P(val));

Data Creation

Creating a variable requires memory to be allocated for it. In C, memory is allocated using functions like

malloc(sizeof(int))

However, this is not a good choice when building extensions.

Use the macro provided instead

MAKE_STD_ZVAL(zv);

This macro allocates the zval and initializes its refcount and is_ref fields, but does not give it a type or value. If you want the zval to be allocated and given a value as well, use another macro.

ALLOC_INIT_ZVAL(zv);

This allocates zv and sets its type to IS_NULL.

In addition to these, Zend exposes further macros for setting the value of a zval.

ZVAL_NULL(zv);

ZVAL_TRUE(zv);

ZVAL_FALSE(zv);

The type and value can also be set by hand.

Z_TYPE_P(zv) = IS_LONG;

Z_LVAL_P(zv) = 42;

Booleans can be set with

ZVAL_BOOL(zv, 1);

ZVAL_BOOL(zv, 0);

or by hand as

Z_TYPE_P(zv) = IS_BOOL;

Z_LVAL_P(zv) = 1;

For a string, this takes a bit more

Z_TYPE_P(zv) = IS_STRING;

Z_STRLEN_P(zv) = len;

Z_STRVAL_P(zv) = val;

Data Storage

When a variable is declared in userspace, Zend stores it in a symbol table. The global symbol table is created at request initialization, before the extensions' RINIT functions are called, and destroyed at request shutdown.

Another important point is that when a function is called, Zend creates a separate symbol table for it, called the active symbol table, and stores the function's local variables there.

While code outside any function is executing, the global symbol table is the active one.

To access the global symbol table write

EG(symbol_table)

Similarly, the active symbol table can be accessed as

EG(active_symbol_table)

Data retrieval

How do you retrieve a value from the symbol table or the active symbol table?

The code looks a bit complex, but I will explain it.

Say we define

$foo = "bar";

We can retrieve its value as

zval **result;

if(zend_hash_find(&EG(symbol_table), "foo", sizeof("foo"), (void**)&result) == FAILURE)

{

php_printf("value not found");

return;

} else {

*return_value = **result;

zval_copy_ctor(return_value);

INIT_PZVAL(return_value);

}

We gave zend_hash_find the following parameters.

&EG(symbol_table) is the hash table to search. It can be any HashTable, such as the one behind an array.

"foo" is the key we want to find.

sizeof("foo") is the length of the key, including the terminating NUL byte.

Last but not least is the address of the variable in which a pointer to the value is stored, if it is found.