Robert Eisele
Systems Engineer, Architect and DBA

PHP Hacking

It took me a while, but here's a new toy. Today I am publishing my own PHP fork based on the PHP 5.3.6 code base, with a few changes that make the everyday developer's life more bearable. It includes some of the patches I published about 3 years ago, my defcon extension and my infusion extension, plus a good bunch of extra gimmicks.

In the MySQL landscape, you can see that the server is forked again and again, which results in a separate project every time, such as Drizzle, MariaDB, OurDelta or the Percona server. I don't want to maintain my own PHP version, but it's fun to improve PHP's behaviour with a view to faster development and also faster execution.

Okay, get the source from GitHub and see what has changed so far.

Performance improvements

Hardcoded strlen() and count()

strlen() is both used very often and very slow. In various PHP performance guides, you can read that isset() is much faster for determining whether a string has at least a certain length. If you want to check the exact length, you end up with something like this: strlen($str) == 32 becomes isset($str[31]) && !isset($str[32]). This is very ugly and hard to read. I added a new opcode for count() and strlen(), which results in up to 10 times faster function execution. A strlen() on a constant string, like strlen("foo"), is optimized away to the constant 3 at compile time, which is cool because more verbose code is no longer a problem.
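
To make the isset() idiom concrete, here is a small sketch that runs on stock PHP and compares both forms of the exact-length check. The sample string is made up for illustration.

```php
<?php
// A 32-character sample string (an assumption for this illustration).
$str = str_repeat('a', 32);

// The readable form.
$viaStrlen = (strlen($str) == 32);

// The "fast" form: offset 31 must exist, offset 32 must not.
$viaIsset = isset($str[31]) && !isset($str[32]);

var_dump($viaStrlen, $viaIsset);
```

Both checks agree for any string; the second one just avoids the function call.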

Hardcoded constants

The constants true, false and null are also used very often. Unfortunately, every use of any of these constants invokes a constant lookup. There is no problem with this, as constant lookups are fast, but I nevertheless implemented these constants directly in the parser to avoid the lookups.

Optimized smart strings

PHP makes use of the smart string library, an internal, dynamically growing string library. I optimized the smart_str_append_long() function to append integer numbers much faster. I've also added a new function, smart_str_append_const(), to concatenate a smart string buffer with a constant string.

Time call optimization

For every time() call in PHP, there is also a time(NULL) call to the kernel, plus a few more for internal handlers. I thought that this was optimized away by the SAPI layer, but the SAPI method for using a cached time(0) is implemented very sparsely. So I replaced all time(NULL) and time(0) calls with the internal SAPI handler and also implemented a SAPI hook for CGI/FCGI. As you may know, there is no way of getting the time via CGI/FCGI, so I also patched lighttpd and nginx to send the time as RAW_TIME. There is a risk that this optimization breaks your script, because you get an old cached time value if the script runs for more than one second, for example inside a daemon written in PHP. Thus, I've added a new ini variable called use_sapi_time to turn this optional optimization on.

strtr() table generation optimization

strtr() creates an internal lookup table to speed up the character replacement. Unfortunately, gcc cannot optimize this table generation away, so I hardcoded the table instead of generating it anew for each call.

Turn off $_REQUEST variable if it's not needed

Registering PHP's super globals consumes superfluous time. The $_REQUEST variable contains all request-relevant parameters, but I never use it; I use $_POST, $_COOKIE and $_GET directly in an OOP fashion instead. Anyway, I added "r" to the ini variable variables_order to make filling the $_REQUEST array optional.

New PHP functions

bool exists(mixed $var[, ...])

exists() is isset()'s little brother, which does not take null values into account. It just tests for the existence of variables and attributes. My old patch removed the null check from isset(), but I wanted to keep backwards compatibility, so I added a new language construct, exists().

Example

if (exists($var, $var->attr)) {}
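
For comparison, this is roughly how the same distinction looks on stock PHP, where isset() treats null like "missing". It is only a sketch of the semantics described above, using array keys and object properties, not the fork's implementation.

```php
<?php
$row = ['name' => null];

// isset() reports false because the value is null.
$withIsset = isset($row['name']);

// array_key_exists() only asks whether the key is there.
$withKeyCheck = array_key_exists('name', $row);

// The same idea for object attributes.
$obj = new stdClass();
$obj->attr = null;
$hasAttr = property_exists($obj, 'attr');

var_dump($withIsset, $withKeyCheck, $hasAttr);
```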

string str_random(int $len[, string $chars="0123...XYZ"]);

Generates a random string very quickly using the underlying operating system.

Example

echo str_random(16);
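
A userland stand-in might look like the sketch below. The name str_random_sketch() is hypothetical, it relies on random_int() from PHP 7+, and the alphabet is a required argument here because the fork's default character set is not spelled out in full above.

```php
<?php
// Hypothetical userland stand-in for str_random(); needs PHP 7+.
function str_random_sketch(int $len, string $chars): string
{
    if ($chars === '') {
        throw new InvalidArgumentException('Alphabet must not be empty');
    }
    $max = strlen($chars) - 1;
    $out = '';
    for ($i = 0; $i < $len; $i++) {
        // random_int() draws from the operating system's CSPRNG.
        $out .= $chars[random_int(0, $max)];
    }
    return $out;
}

echo str_random_sketch(16, 'abcdef0123456789'), "\n";
```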

int ob_fwrite(resource $fd[, int $len=0])

Writes the ob buffer to an opened file handle.

Example

ob_start();
$fd = fopen('/cache/site.txt', 'w+');
echo "This goes to the file";
ob_fwrite($fd);
ob_end_clean();
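
On stock PHP the same effect can be approximated with ob_get_contents() and fwrite(). This sketch uses php://memory in place of a real cache file so that it runs anywhere; that substitution is an assumption of the example.

```php
<?php
ob_start();
echo "This goes to the file";

// php://memory stands in for a real cache file in this sketch.
$fd = fopen('php://memory', 'w+');

// Copy the current buffer contents to the handle, then drop the buffer.
$written = fwrite($fd, ob_get_contents());
ob_end_clean();

// Read the data back to show what ended up in the "file".
rewind($fd);
$copy = stream_get_contents($fd);
fclose($fd);
```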

mixed timechop(int $time[, mixed $format=2, bool $is_array=false])

Chops a time into smaller pieces and returns it as formatted string or as array. The format is a mixed type and can be defined as integer as a number of entities or as string to define the units you want to get. The types for the selective mode are defined as:

k - decade
y - years
n - months
w - weeks
d - days
h - hours
m - minutes
s - seconds

The time value switches to the delta of the current time and the passed value if the value is too big and looks like a unix timestamp.

Example

var_dump(timechop(2392383, "ynwdms", true));
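
The basic idea of chopping a duration into units can be sketched in plain PHP. The helper below is hypothetical and deliberately limited to days, hours, minutes and seconds, since the lengths the fork uses for weeks, months, years and decades are not documented here. It needs PHP 7+ for intdiv().

```php
<?php
// Hypothetical helper: splits a number of seconds into d/h/m/s parts.
function timechop_sketch(int $seconds): array
{
    $units = ['d' => 86400, 'h' => 3600, 'm' => 60, 's' => 1];
    $parts = [];
    foreach ($units as $unit => $size) {
        // Take as many whole units as fit, carry the rest down.
        $parts[$unit] = intdiv($seconds, $size);
        $seconds %= $size;
    }
    return $parts;
}

var_dump(timechop_sketch(2392383));
```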

int xround(int $num)

Round to the next power of 10. This breaks down 10^(log(n) / log(10)), with the exponent rounded to a whole number, by using a fast binary search.

Example

echo xround(2344);
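
A simple userland sketch follows. It assumes that "next" means the smallest power of 10 that is greater than or equal to the input, that the input is at least 1, and it uses a plain loop rather than the binary search mentioned above. The function name is hypothetical and integer overflow is ignored.

```php
<?php
// Hypothetical stand-in for xround(); plain loop, not a binary search.
function xround_sketch(int $num): int
{
    $power = 1;
    while ($power < $num) {
        $power *= 10;
    }
    return $power;
}

echo xround_sketch(2344), "\n";
```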

double sigfig(double $num, int $figs)

Calculates the significant figures of a number.

Example

echo sigfig(123.34, 4);
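
Assuming sigfig() rounds a number to the given count of significant figures, a userland version can be built on round() with a precision derived from the number's magnitude. The function name below is hypothetical.

```php
<?php
// Hypothetical stand-in: round $num to $figs significant figures.
function sigfig_sketch(float $num, int $figs): float
{
    if ($num == 0.0) {
        return 0.0;
    }
    // Number of digits in front of the decimal point.
    $magnitude = (int) floor(log10(abs($num))) + 1;
    // A negative precision rounds to the left of the decimal point.
    return round($num, $figs - $magnitude);
}

echo sigfig_sketch(123.34, 4), "\n";
```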

int sgn(double $num)

Calculates the sign of a number.

Example

echo sgn(-0.23);
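
On stock PHP 7+ the spaceship operator gives the same result, assuming sgn() returns -1, 0 or 1. The wrapper name is hypothetical.

```php
<?php
// Hypothetical stand-in: -1 for negative, 0 for zero, 1 for positive.
function sgn_sketch(float $num): int
{
    return $num <=> 0;
}

echo sgn_sketch(-0.23), "\n";
```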

string strcut(string $str, int $num[, string $x='...'])

Cuts a string if it's longer than a maximum length and appends a given string. This function doesn't chop words.

Example

echo strcut("This is a very very very very long string which will be truncated", 15);
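
The word-preserving behaviour can be sketched in userland as follows. The function name is hypothetical, and the rule "back up to the last space inside the limit" is my reading of the description, not taken from the fork's source.

```php
<?php
// Hypothetical stand-in for strcut(): cut at $max bytes without
// splitting a word, then append $suffix.
function strcut_sketch(string $str, int $max, string $suffix = '...'): string
{
    if (strlen($str) <= $max) {
        return $str;
    }
    $cut = substr($str, 0, $max);
    // If the cut lands inside a word, back up to the last space.
    if ($str[$max] !== ' ') {
        $lastSpace = strrpos($cut, ' ');
        if ($lastSpace !== false) {
            $cut = substr($cut, 0, $lastSpace);
        }
    }
    return rtrim($cut) . $suffix;
}

echo strcut_sketch("Hello this is my blog", 15), "\n";
```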

string strcal(string $format, string $str[, int $len=-1])

String calibration: checks whether the string matches a given format, written in a simple regexp-like notation.

Example

if (strcal("a-z!", $str)) {}

string strical(string $format, string $str[, int $len=-1])

String calibration that ignores upper and lower case.

Example

if (strical("a-z!", $str)) {}

string strmap(string $str, array $replace)

Brings a simple template parser to PHP. The idea comes from C#'s printf()-like functionality.

Example

echo strmap("This is {first} and {second}", ["first" => "X",
"second" => "Y", "third" => "Z"]);

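The placeholder replacement can be approximated on stock PHP with strtr(). This is a sketch under the assumption that unused keys are simply ignored; the function name is hypothetical.

```php
<?php
// Hypothetical stand-in for strmap(): replace {key} placeholders.
function strmap_sketch(string $template, array $replace): string
{
    $pairs = [];
    foreach ($replace as $key => $value) {
        $pairs['{' . $key . '}'] = (string) $value;
    }
    // strtr() replaces all pairs in a single pass over the string.
    return strtr($template, $pairs);
}

echo strmap_sketch("This is {first} and {second}", [
    "first" => "X", "second" => "Y", "third" => "Z",
]), "\n";
```
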
int bround(int $num, int $base)

Round to the next multiple of a certain base.

Example

echo bround(283, 5);
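
A userland sketch, assuming "next" means the nearest multiple of the base (the example value gives the same result when rounding up). The function name is hypothetical.

```php
<?php
// Hypothetical stand-in: round $num to the nearest multiple of $base.
function bround_sketch(int $num, int $base): int
{
    return (int) round($num / $base) * $base;
}

echo bround_sketch(283, 5), "\n";
```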

mixed bound(mixed $num, mixed $min[, mixed $max])

Limits a number to a specified lower (min) and an upper (max) value.

Example

echo bound(43, 22, 50);
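
In plain PHP this is the familiar clamp pattern. The sketch below treats an omitted maximum as "no upper limit", which is an assumption based on the optional parameter in the signature; the function name is hypothetical.

```php
<?php
// Hypothetical stand-in for bound(): clamp $num into [$min, $max].
function bound_sketch($num, $min, $max = null)
{
    if ($num < $min) {
        return $min;
    }
    if ($max !== null && $num > $max) {
        return $max;
    }
    return $num;
}

echo bound_sketch(43, 22, 50), "\n";
```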

Usability improvements

foreach() for strings

Writing parsers in PHP mostly results in for() + strlen() + substr() constructs. I modified foreach() to be able to loop through strings in order to get the characters and their index. This is prettier and also much faster than the previous method.

Example

// Simulating str_split($str)
$str = "PHP is cool!";
$arr = [];

foreach ($str as $k => $v) {
	$arr[$k] = $v;
}
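
For reference, these are the two stock PHP ways of walking a string that the modified foreach() replaces. The sketch only shows the equivalent result, not the fork's behaviour.

```php
<?php
$str = "PHP is cool!";

// Variant 1: split first, then iterate the resulting array.
$arr = [];
foreach (str_split($str) as $k => $v) {
    $arr[$k] = $v;
}

// Variant 2: the classic for() + strlen() loop.
$old = [];
for ($i = 0, $n = strlen($str); $i < $n; $i++) {
    $old[$i] = $str[$i];
}

var_dump($arr === $old);
```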

Delete characters with strtr()

Deleting several characters from a string can cause multiple str_replace() calls. It's now possible to delete all characters at once using strtr().

Example

$demise = strtr("passion", "os", "");
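
On stock PHP, the same deletion needs str_replace() with an array of needles. A quick sketch of the equivalent call:

```php
<?php
// Remove every "o" and "s" in one str_replace() call.
$demise = str_replace(['o', 's'], '', "passion");

echo $demise, "\n";
```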

Key implode

I very often need a list of the keys of an array. One way to get it is implode(array_keys($arr)), which is not that fast and does not look very nice. implode() now has a new parameter to return the keys instead of the values:

Example

$keys = implode(',', $arr, true);
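
The stock PHP equivalent, for comparison (the sample array is made up):

```php
<?php
// Sample data, invented for this illustration.
$arr = ['id' => 1, 'name' => 'x', 'mail' => 'y'];

// Two function calls and a temporary array of keys.
$keys = implode(',', array_keys($arr));

echo $keys, "\n";
```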

Negative string offsets

What happens if you write $str[-5]? Right, you get a warning and the expression returns null. But why should we give that away? We could use negative string offsets in the same way as positive string offsets, with the difference that we start at the end of the string. So [0] is the first character of the string, [1] the second, [-1] the last, [-2] the second last and so on. This is really intuitive, makes the code cleaner and avoids nasty strlen() baubles.

Example

$str[-1] == $str[strlen($str) - 1] && $str[-1] == substr($str, -1, 1)
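
The two long-hand forms can be checked on any stock PHP version with the sketch below. As far as I know, upstream PHP later gained negative string offsets itself (in 7.1), but the code here avoids them so it does not depend on that.

```php
<?php
$str = "passion";

// Classic form: compute the index from the length.
$viaLength = $str[strlen($str) - 1];

// substr() has always accepted negative start positions.
$viaSubstr  = substr($str, -1, 1);
$secondLast = substr($str, -2, 1);

var_dump($viaLength, $viaSubstr, $secondLast);
```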

Binary numbers

In C# you can define binary numbers in a similar way to how you write hexadecimal numbers: 0x90. With this change you can define binary numbers with a 0b prefix like this: 0b01001. I don't know if this feature is good for common use because, as you may know, there are after all only 10 persons who understand binary. But I use bit sets very often and this is a good and fast way to do this.

Example

0b101 << 1 == 0b1010
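
Without the 0b prefix, the same bit pattern has to go through bindec() and decbin(). A small sketch of the example above in that style:

```php
<?php
// bindec() parses a binary string into an integer.
$flags = bindec('101');

// Shift left by one bit and print the result in binary again.
$shifted = $flags << 1;

echo decbin($shifted), "\n";
```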

Short array

Programmers are such lazy folks, and writing array() is really annoying. Here is an attempt to make this handier.

Example

$arr = [1, 2, [5 => "foo", 3.14159], 9];

Better chr() handling

Converting ASCII codes to real characters is possible with the chr() function. Unfortunately, you can only pass one character at a time. Now it's possible to pass a list of ASCII codes via an array or via a variable parameter list.

Example

"Abc" == chr(65, 98, 99)

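On stock PHP, the multi-code variant can be emulated with array_map() and implode(). The helper name is hypothetical.

```php
<?php
// Hypothetical helper: turn a list of ASCII codes into a string.
function chr_list_sketch(array $codes): string
{
    return implode('', array_map('chr', $codes));
}

echo chr_list_sketch([65, 98, 99]), "\n";
```
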
Microtime default parameter

A very useless default parameter is the one of microtime(). As I remember, with PHP 4 everyone used an explode() + subtraction to work around microtime()'s return value. With PHP 5 it became possible to return the time as a double, but this is not the default. I broke API compatibility here and return the µ-time as a double by default.

Example

$time = microtime();
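
For comparison, here is how the two stock PHP styles look side by side: the old string-splitting workaround and the explicit microtime(true) call.

```php
<?php
// PHP 4 style: microtime() returns "msec sec"; split and add the parts.
list($usec, $sec) = explode(' ', microtime());
$old = (float) $usec + (float) $sec;

// Stock PHP 5+: ask for a float explicitly.
$new = microtime(true);

var_dump($old, $new);
```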

UTF-8 and ENT_QUOTES as default

As most web applications should work with UTF-8 to make i18n easier, it is a good idea to bring UTF-8 into the game as the default. The same is true of ENT_QUOTES. Okay, I must admit, this change is also a little product of laziness, because I hate writing ENT_QUOTES, "UTF-8"; this was the last time.

Example

$encoded = htmlspecialchars($ugly);
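
Spelled out for a stock PHP 5.3, the call with explicit flags looks like this. The input string is invented for the example.

```php
<?php
// Sample input, made up for this illustration.
$ugly = '<a href="x">Tom\'s & Jerry</a>';

// ENT_QUOTES escapes both double and single quotes.
$encoded = htmlspecialchars($ugly, ENT_QUOTES, 'UTF-8');

echo $encoded, "\n";
```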

Disable include warnings

A really annoying problem is include warnings spamming the logfile when you leave out file_exists() checks. You could add an @ sign in front of the include command, but this forces PHP to be silent for the entire file. PHP now has a new ini directive, ignore_include_warning, to disable include warnings with ini_set() or globally.

Omitting quoting with json_encode()

Quoting is necessary to satisfy the JSON format. As an extension, it is sometimes nice to define callbacks in a JSON string. I added a new bit-mask constant, JSON_CALLBACK_CHECK, in addition to the undocumented JSON_NUMERIC_CHECK. If the callback-check flag is set, the prefix __cb on a string value marks it as an unquoted callback string.

Example

$json = json_encode(array(
	"func" => "__cbRaise",
	"number" => "1234",
	"native" => 9876,
	"nocb" => "__cb is the beginning, but it isn't a Callback",
	"123" => "text"
), JSON_NUMERIC_CHECK | JSON_CALLBACK_CHECK);

MySQLi/mysqlnd changes

Native type casting turned on by default

I think it's a good idea to turn on native type casting by default. This reduces cache sizes of installations where people don't care about such things and also increases execution performance if numbers from databases are heavily involved in calculations.

mysqli_fetch_all() returns associative arrays by default

The MySQLi function mysqli_fetch_all() returns an indexed array by default. The performance benefit of doing so is very low; using associative arrays should be better with regard to easy and readable code.

MySQLi matched rows

The MySQLi attribute matched_rows and the accompanying procedural mysqli_matched_rows() function return the number of matched rows of the last SQL operation. If you update a table and the affected_rows number is, e.g., 5, this doesn't mean that 5 is also the number of rows that matched the WHERE clause. Without this attribute, retrieving that number means running another SELECT COUNT(1) query with the same condition or parsing the mysqli_info() output yourself.
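
The "parse it yourself" workaround can be sketched as below. The sample string follows the usual shape of the info text MySQL returns for an UPDATE; the helper name is hypothetical, and the nullable return type needs PHP 7.1+.

```php
<?php
// Hypothetical helper: pull "Rows matched" out of a mysqli_info() string.
function matched_rows_sketch(string $info): ?int
{
    if (preg_match('/Rows matched:\s*(\d+)/', $info, $m)) {
        return (int) $m[1];
    }
    return null; // No UPDATE-style info text available.
}

// Sample info text, as it would come from mysqli_info($link).
echo matched_rows_sketch("Rows matched: 5  Changed: 3  Warnings: 0"), "\n";
```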

mysqli_return($res,[$free=false])

The function mysqli_return() is the equivalent of mysql_result(). The difference between mysqli_return() and its older counterpart is that the MySQLi version frees its resource after returning the value by default. You can turn off this behaviour, but I wanted a function which can be used to return and free a single value instantly.

Example

public function value($query) {
	$res = mysqli_query($this->db, $query);
	return mysqli_return($res);
}

Tidied up PHP

The PHP fork has been rid of the following old and deprecated functionality in order to make the code base smaller and to improve the execution time. This may limit the usage of PHP in some scenarios, but first read on to the next section.

  • Deleted define_syslog_variables
  • Deleted magic quotes
  • Deleted register globals
  • Deleted ASP-tags
  • Deleted short open tags and <?php= is the new <?=
  • Deleted allow_call_time_pass_reference
  • Reduced the default memory limit to 16MB
  • Deleted safe mode
  • Deleted disable functions/classes

New ini file

I've added a new ini file in order to have more control over PHP. It's possible to define and delete constants, declare variables as SUPER and rename and delete functions and classes. The new ini file looks like this:

[Constant]

;;; General Config
string ADMINPASS        = "admin";
string ADMINMAIL        = "robert@xarg.org";

string PASSSALT         = "I'm a very good password salt";
int ONLINE_TIME         = 1200; == session.gc_maxlifetime

;;; DB Config
string DB_USER          = "root";
string DB_PASS          = "";
int CLUSTER_SIZE        = 31; (1 << 5) - 1

;;; Test
float TEST              = "333.32=ddd";
delete PHP_VERSION_ID;
delete CURLOPT_SSLCERTTYPE;

;;; SQL Shorthands
string SQL_USER         = "UID, UName, USex, UPic";
string SQL_NOLOG        = "SET SQL_LOG_BIN = 0";
string SQL_TIMEOUT      = "SET WAIT_TIMEOUT = 3600";
string SQL_DATE         = "'%d.%m.%Y'";
string SQL_TIME         = "'%H:%i'";
string SQL_DATETIME     = "'%d.%m.%Y %H:%i:%s'";

[Variable]
super test;
super time;

[Function]
;rename strlen          = abc;
;delete substr_count;
;delete substr;

[Class]
delete stdClass;

I named the file php-global.conf; its location can be set with a php.ini variable like so:

globals.filename = "/etc/php-global.conf"

Bug fixes

  • chroot() wasn't enabled for fpm, just for the cgi SAPI
  • Sending "private" with the nocache session cache limiter
  • Make "false" printable with print_r instead of an empty string

Some further ideas (not implemented yet)

  • A new preg_replace() modifier to upper and lower strings directly
  • unpack() returns an array even if the count of the array is 1. A mixed return type would save the array handling, both internally and in user space
  • If one character is used with arithmetic operation, it COULD be used as ascii-code instead of parsing the string
  • I wrote my own mysqli_real_escape() function based on the code of libmysqlclient. This function is strongly optimized and therefore faster. Additionally, it does not need a connection handle. I would be glad to see an escaping function, which can get the encoding from the local configuration instead of using a database handle.
  • I have not invested time in finding out whether APC optimizes $i++ to ++$i if the value is read-only. If not, it would be cool to have such a feature directly in the core to save some time. But maybe this is a better job for an opcode optimizer, which also reduces the number of redundant jumps and so on.

Ready for takeoff

If you like the features of the modified PHP 5.3.6, you can get it from GitHub. I would be glad to hear about further improvements that should be implemented and also to hear what you think about the changes I made.

85 Comments on „PHP Hacking”

Raymond

I would like to see more objects:

String::length()
String::pos()
Number::format()

nandy

Thanks for sharing your ideas and also for your edited PHP version. I downloaded it already. Thank you!

Wilfried Loche

Impressive fork! I really like optimizations and all things that make the developer's life easier, especially the [] notation instead of array(), or foreach () on strings.

But, as other users have said, backward incompatibilities must be handled cautiously (exceptions for ALL deprecated functions/methods/classes and .ini flags :)): I have 2 major web sites in production whose code I wouldn't be happy to review entirely (or ask to have reviewed :)) to make it compliant with a new PHP release.

Hope to see (most of) your stuff in PHP 5.4!

A PHP user/developer,
Wilfried

BUI

I like to see PHP++

Robert

@Raghu,

I removed the disable_{classes, functions} feature because I added a new ini file, which allows removing classes, functions and constants, besides adding new constants, declaring variables as super and so on.

It's a matter of taste, of course, but I work mostly with mysqli, as it offers the most control over MySQL and mysqlnd, and thus I don't have much experience with improving PDO.

Raghu Veer

I appreciate your efforts, good work.

hopefully "leaving code to enable/disable functions/classes" is helpful kind of... even though permanently removing insecure methods from PHP is nice, sometime, some may require the functionality, so, they may attempt to enable them under restricted conditions know in their setups.

just some thoughts, thank you

will be glad to see most of these features, if not all in PHP mainstream.

please leave OOP intact, like you also wish to (I personally like procedural mostly, I prefer it alot which is my personal choice ofcourse).

Please concentrate on improving PDO, which do good overall when dealing with sql databases.

We are using pdo from many years, whichever php script or php framework I come across these days, most of them use or are changing to PDO as standard approach for sql databases.

just some thoughts, congrats again, thank you

Paul

Whoa, these are some awesome language constructs! I agree with almost all of them, and I would love to use a bunch of them today, like binary (0b1010) and defining static arrays using square-brackets. And speed-improvements to strlen. And xround, sigfig, and bround!

I don't like the idea of forking source code just to get new features, but it's really tempting. I worry how hard it will be to keep up with future PHP versions, and not lose these cool language add-ons.

I hope the main PHP code maintainers can find a way to safely integrate lots of these into the main distribution. Language advancements don't have to be huge changes; sometimes little improvements can be REALLY nice, even enjoyable.

It's more motivation for developers to find a way to upgrade to the current releases in the future.

Jakub Vrana

Kudos for perfect guerrilla approach. I've added the documentation of JSON_NUMERIC_CHECK - http://docs.php.net/manual/en/json.constants.php - thanks for pointing that out.

I agree with something but I don't agree with everything - reasoning is available at http://php.vrana.cz/fork-php.php - in Czech :-). Google Translate is readable a little bit: http://translate.google.com/translate?u=http%3A%2F%2Fphp.vrana.cz%2Ffork-php.php

Mike

Hopefully this will get the main line moving again, which, from what I've read, is what you hope too. Sometimes when you have the same group working on a project for years, it takes the 'new guy' getting in the middle of everything and stepping on a few toes to get people going again. That it took them over a year to even look at some of your code when you submitted it before tells me something is broken that needs to be looked at on their part and resolved. Good luck and keep up the good work.

Camilo Zambrano

Dear Robert,

I have read all of your post and I was completely amazed. First of all I would like you to communicate with me, cos I'm creating with my teammates a new PHP-based Framework, we called it Silverfang, and your experience could motivate the team.
Besides, I'd like to propose you to think in these additions:

>> JavaScript collection definition support, it'd be interesting if I could define an associative array only with "{}".
>> C++ - like namespaces would be more treatable than current namespaces.
>> Try to find a way to communicate javascript with PHP in both ways, this could facilitate the creation of amazing GUI's without a lot of work.

I think that is all, if you're interested in share your experience please send me a mail to camilo@silverfng.net

Greetings!

JennyS

+1 on the array literal syntax! It's been needed for a long time. I don't know if an object literal syntax is needed - the one use-case for an array literal I can see is to pass named optional params to a function, & I don't see how making it an anonymous object helps there.

+1 on the 0b... notation. Hexadecimal only takes readability so far when you're specifying bitflags.

+1 on foreach. I've always thought that foreach should be as much of a generalized iterator as is feasible.

+1 on the ability to declare variables as superglobals.

+1 on bound(). Cleans up a very common annoyance.

How about the ability to declare ALL global variables as superglobals, like every other C-style language treats them? (I know this could be a dangerous request, as it would create awfully subtle bugs in legacy code.) Or lacking that, at least a method to create a singleton object that is automatically a superglobal? Something like this:

$cfg = new singleton MyConfigurationVarsClass ();
$err = new singleton MyErrReportingClass ();

Ernestas

@marco, try using the imagemagick PHP extension. It's used in Pixelmator too. http://php.net/manual/en/book.imagick.php
You probably won't be able to build a face recognition app, but it's still a nice lib and much faster than GD. I wish it were the default for PHP.

Are P

Many of these changes could be made into a php extension, if it's not possible to get it into the official PHP repo.
That way it's easier for more people to use them.

Jonathon Hibbard

My comments (if they matter):

The new usage of creating arrays is cool. It makes managing arrays that much easier as it is also the way one uses/declares an array through JSON (along with JS).

Default to UTF-8 has been something I've been wanting to see for a long time.

MySQLi and mysqlnd changes, to me, are a bit odd... I would rather have seen an improvement to PDO instead as I know very few who even use MySQLi/mysqlnd anymore...

The foreach update is kinda ugly... Trust me, I don't mean that in a bashing way (you obviously know more about C/C#/C++ than I will ever know), yet there obviously should be a different method/language construct for doing this same process. Mainly because, now the foreach has yet another meaning, and keeping track of them all are already a bit cumbersome. Just my 2c's

The addition of adding SUPER is freaking amazing and has been much needed -- along with the removal of the garbage settings (magic quotes, etc) that really shouldn't even be available anymore (I honestly believe they exist to keep a very minority percentage of devs happy).

the microtime method may be the one method that will prevent implementation. while daemons shouldn't be written in PHP, it is and has been done. otherwise, i believe i would be very open to trying this build.

Overall, this is an amazing fork though. I really like the improvements. It screams "why haven't the php core devs seriously considered these changes or implemented them?".

Thanks for the effort, the dedication, and the amazing work. I'll def try to use this version on my personal side projects though!

Cheers,
Jon

Aldonio

I really approve this :D

Marco Rodrigues

I hope these changes get into official PHP version, they're great =)

Adriano Ramos

Nice work Robert Eisele! Goodbye legacy functions...

m a r c o

Why not? Anyway PHP still remains unusable (too slow) for online image filtering :-/
I think you've got an error in line $str(strlen($str) - 1]
that should be $str[strlen($str) - 1]

Felipe

Great job!
PHP was really needing a fork, to clean the old backwards-compatible shit and evolve free from the php-internals non-sense bureaucracy!

Congratulations!

Louis

Nice job dude! Keep going! :-)

Ben

I'd want to use just for the short arrays!

Robert

Cezary, I think the problem should be easy to fix. I'll take a look at the whole thing at the weekend.

Cezary

How hard would it be to fix this bug?
http://bugs.php.net/bug.php?id=55032

Robert

@bungle, I read your patch and the idea reads well. The problem I see is that automatic escaping brings a lot of problems into play - look at magic quotes.

Anyway, I described a similar thing in the article about a transparent query layer over here: http://www.xarg.org/2010/11/transparent-query-layer-for-mysql/ - The user input is written unfiltered to a storage system (SQL/NoSQL/flat file) and, before it is sent to the client, it gets defused to prevent XSS. But the implemented approach escapes everything except content which is returned by the outermost scope of PHP. You filter out numbers and safe strings, too, which in turn brings a huge overhead into play (much more than magic quotes). Furthermore, the implementation also looks a bit slow, using htmlspecialchars() via call_user_func(). I'll soon publish the way I escape output properly and in a developer-friendly manner.

bungle

How about including this:
https://github.com/xmalloc/php-auto-escape

Robert Eisele

For all who have problems getting it compiled correctly, try installing bison and re2c and run configure again!

I will deliver separate patches for each change soon.

@Merlijn, you're right, open source lives on the "we" idea, but as I've said often enough, I tried the official way years ago and, as it turns out, the way I've gone now is much more effective for pushing through some changes and building a community of my own. There are more than enough people who have the knowledge to change something; what is lacking is initiators.

@Corey: I agree that the community is the lifeblood of the project and they should listen to them from time to time.

@Kaffee, there is a nice discussion on Hacker News about my publications: http://news.ycombinator.com/item?id=2640756 I totally agree with most of the things said over there, especially that hardcoding functions cannot be the way to go. Function calling is slow in general; optimizing it altogether would improve the performance of PHP for every function call. I think strlen() and count() are nevertheless not wrong in the core of PHP, as they are used very often.

Making everything OOP, as Alex Dowgailenko has said, would make the problem obsolete, because you would access a "length" attribute of an object directly, and the length of a string is known at all times by the zval data type. My patch simply makes this attribute accessible. About the PI: do you see any advertising on my blog?

@Christian Sciberras, strcut() is not only for cutting a string. It also keeps whole words and appends a filling string. For example, strcut("Hello this is my blog", 15) will result in "Hello this is..." instead of "Hello this is m". But the idea of returning an object from microtime() is very cool!

@azat, no, my site is running on an unoptimized (sponsored) host. I think I'll order my own server in a few weeks in order to make it faster. My blog is already acceptably fast, but this is probably the work of my framework.

@Sara Golemon, yes, I know the JIT behaviour. I worked on another place around the super globals and did not publish it completely. In hindsight, the published patch for $_REQUEST doesn't make sense. But thanks for the hint!

@mario: Removing short_tags solves a lot of problems! The biggest would be the usage of XML (including XHTML) with PHP. It was a nice idea to let the user write a script the way he wants, but a few more restrictions with regard to other standards make development easier. Not least, those things unnecessarily inflate the code.

@SEO-loving "PHP Werbeagentur", I think the difference between PHP and jQuery is that jQuery is developed by only one main developer full-time with many contributors behind him, while PHP is developed by many developers in their spare time. In my view, PHP 6 must be a complete rewrite of PHP from scratch based on the experience gained in the previous years. A new VM is needed, and a consistent syntax and OOP usage are needed without regard to backward compatibility. This way new projects could be implemented on PHP 6 and all others could use PHP 5.x for, let's say, 2-3 years until that version is discarded. Just as it was with PHP4->PHP5 at that time.

@RobertK: As I wrote above, rebuild the bison and re2c files. If nothing helps, try
touch Zend/zend_language*
as git could have mangled the timestamps.

EllisGL

@Robert: Yeah - the isset() was a bad example. I couldn't think of anything else off the top of my head.

Mike

Short array syntax! My Hero!

Rodrigo

sound of anarq :)

RobertK

@Seb can you describe how you solved problem with asp_tags and short_tags in zend_language_scanner? I've got the same problem...

Seb

[error] [client 127.0.0.1] PHP Fatal error: Call to undefined function count()

Is count() disabled? :s

Petah

I dig the short arrays, function renames, and speed optimizations.

What I Don't dig, is the backwards compatibility changes.

PHP Werbeagentur

One of the upcoming PHP versions should really implement some of this improvements. Especially the strlen stuff is super sweet.

Core-Team People *have* to change their mindset sooner or later, or PHP will suffer from it. Hey folks, be more like jQuery:
- Lots and lots of Community Contributions
- Huge performance improvements with every release
- A Year 2010+ Syntax which feels fluffy
PHP5.3 was a huge step forward. Go on like that!

Wil Moore III

It might also be a good idea to link to this post from the github repo and maybe even use the wiki to link to some of the interesting posts as listed in some of the comments.

Wil Moore III

Thanks for doing this. If nothing else, please do get the array literals into core. Every other sane language has them and it makes the syntax more concise.

Excellent work.

mario

So that's an exceptionally cool effort I must say. If I had a little more C experience I would immediately sign up to further this anti-PHP-stagnation effort.

But I've also got some nitpicks, of course. A few of the new functions really belong in userland. And the removal of asp_tags and short_tags gave me a fancy compilation error. That's also a bit pointless, I must say. Removing short_tags doesn't solve many actual problems; a new syntax just creates incompatibility for no gain.

Likewise does the removal of allow_call_time_pass_reference make no sense to me. The deprecation didn't. While I don't use/need it, I see the advantage of that option. And removing that just degrades the high levelness of PHP. Nevermind that you can just use `call_user_func_array("func", array( & $ref ) );` anyway. But that's cumbersome, and an unneeded workaround. So the speed improvements from removing the built-in language capabilities should better be measurable.

Lastly, I would also prefer a separate patch. That would allow to selectively enable features and e.g. distributions to include it more easily I guess (and also avoid the artistic license conditions bickering, btw).

Seb

I tried to compile with:

./configure --with-apxs2=/usr/bin/apxs2 --with-pdo-mysql

I get this error on make:

Zend/zend_language_scanner.l:1599:10: error: ‘struct _zend_compiler_globals’ has no member named ‘short_tags’
Zend/zend_language_scanner.l:1604:10: error: ‘struct _zend_compiler_globals’ has no member named ‘asp_tags’
Zend/zend_language_scanner.l:1573:6: error: ‘struct _zend_compiler_globals’ has no member named ‘short_tags’
Zend/zend_language_scanner.l:1550:6: error: ‘struct _zend_compiler_globals’ has no member named ‘asp_tags’
Zend/zend_language_scanner.l:1524:6: error: ‘struct _zend_compiler_globals’ has no member named ‘asp_tags’
Zend/zend_language_scanner.l:1537:6: error: ‘struct _zend_compiler_globals’ has no member named ‘short_tags’
Zend/zend_language_scanner.l:1713:10: error: ‘struct _zend_compiler_globals’ has no member named ‘asp_tags’
Zend/zend_language_scanner.l:1779:6: error: ‘struct _zend_compiler_globals’ has no member named ‘asp_tags’
Zend/zend_language_scanner.l:1483:3: warning: ignoring return value of ‘getcwd’, declared with attribute warn_unused_result
make: *** [Zend/zend_language_scanner.lo] Erreur 1


Any ideas?

node.js tutorial

Does this hack work on large platforms?

hSATAC

@pistole so did you manage to build this? I got re2c and bison 2.3 installed and the ./configure output looks ok, but I still got the same error message.

not-just-yeti

Actual improvement measurements with benchmarks are definitely in order. +1 to everything Kaffee said.

RaoHe

Pretty much everything got a bit of a yay from me! But as many have already proposed, why not push them to the official core?

But still, awesome job!
Array lazy style really tickled my fancy.

Bruno Magalhães

Wonderful work! Nicely done!!! I will definitely give this a try! Keep it going, and if you need help I would love to help!!!

Christian

Impressive!

I should congratulate you for this! But I have just one question ... Instead of forking it, why didn't you contribute all these features to PHP?

Greetings & Congrats for the work!
Christian

Timothy Warren

I agree with most of these, except for short tags. I don't see any reason that

Sara Golemon

FYI, auto globals (like $_REQUEST) can have an optional "arming" function which the compiler calls when use of it is encountered in the script code. You could make use of this to populate $_REQUEST (or the others) on demand, rather than making it an explicit config option.

Martin

A lot of these additions and changes do make sense and solve everyday problems and nuisances for PHP developers. However, I doubt the Kremlin will allow many of them. I can remember how a similar suggestion for array creation got shot down on internals - deemed not necessary.

pistole

@Robert I already had bison and re2c installed, I had to manually delete the files in the commented out CLEANFILES def from Zend/Makefile.am before it'd re-generate them.

I think git may have broken the timestamps on the files, leading make to not consider them changed.

pistole

@hSATAC having the same compile problem

Robert

@EllisGL,

isset() isn't a function. It's a language construct, and it's a good idea to distinguish the two in this way. But PHP does have a lot of weird naming, above all in string functions like str_replace() and strpos().

Robert

@hSATAC,

you have to install re2c and bison in order to generate the new files. I did not include the generated files because otherwise the diff would be very bloated. Once you've installed re2c and bison, run the configure script again. After that, run make install and you're done.

Robert

EllisGL

Oh, and also UTF-16/32 support.

EllisGL

Great work. I hope your stuff takes off!

A couple things I would like to see:
1. Better support for serial communications for both unix/win.

2. Needle/haystack argument ordering.

3. A standard naming convention, e.g. is_array vs isset...

hSATAC

Has anyone built this successfully?
I tried to make it on a clean Ubuntu VM.

./configure --prefix=/usr/local/lib/php5.3.6.infusion --with-apxs2=/usr/bin/apxs2
make

and it gave me:

Zend/zend_language_scanner.l: In function ‘lex_scan’:
Zend/zend_language_scanner.l:1599:10: error: ‘struct _zend_compiler_globals’ has no member named ‘short_tags’
Zend/zend_language_scanner.l:1604:10: error: ‘struct _zend_compiler_globals’ has no member named ‘asp_tags’
Zend/zend_language_scanner.l:1573:6: error: ‘struct _zend_compiler_globals’ has no member named ‘short_tags’
Zend/zend_language_scanner.l:1550:6: error: ‘struct _zend_compiler_globals’ has no member named ‘asp_tags’
Zend/zend_language_scanner.l:1524:6: error: ‘struct _zend_compiler_globals’ has no member named ‘asp_tags’
Zend/zend_language_scanner.l:1537:6: error: ‘struct _zend_compiler_globals’ has no member named ‘short_tags’
Zend/zend_language_scanner.l:1713:10: error: ‘struct _zend_compiler_globals’ has no member named ‘asp_tags’
Zend/zend_language_scanner.l:1779:6: error: ‘struct _zend_compiler_globals’ has no member named ‘asp_tags’
Zend/zend_language_scanner.l:1483:3: warning: ignoring return value of ‘getcwd’, declared with attribute warn_unused_result
make: *** [Zend/zend_language_scanner.lo] Error 1

azat

Yes, it sounds good.
Is this site running under your version of PHP?
Because it's really fast :)

tonybaldwin

I, too, am kind of wondering why you're forking instead of submitting these upstream.

Marcel Esser

Some of these patches are really great ideas. Congrats on your hard work.

Spechal

I really do wish some of these functions were in the core. Great work, and I hope the "we"s see the "they"s and their wants.

JP

I think your initiative is great.

PHP should move faster and in the right direction, and you have some good ideas there that I think they should consider.

One year to read your proposal is way too slow...

Christian Sciberras

Oh, I forgot to comment about microtime() compatibility. Instead of returning a float, you should make it return an object with the magic methods __toString() and __toFloat() (you'll have to add support for __toFloat() though). There, it's compatible and the way you want it!

Christian Sciberras

Hello!

In Delphi/Pascal there's a very handy function called setlength(). You can set the length of a string as well as of an array.

I think it's much more appropriate than strcut() since it's multi-purpose. Not sure how much slower it would make it, though.

Juan Felipe Alvarez Saldarriaga

Great work! A new syntax for objects would be nice too :) https://wiki.php.net/rfc/objectarrayliterals, keep going! :D

Anthony W.

I was seriously looking at the failure of PHP to loop through strings in a foreach() construct the other day. I hope this feature makes its way into 5.4 as a patch.

Kaffee

You start the whole post with "Performance improvements" - but I miss some benchmarks with results :(

And small patches contributed and discussed at php.net, instead of a fork, would be the best way to improve the language and not only the page impressions of your blog!

But some ideas are really good and it would be awesome to get those into the official php. :)

Simon

Looks good (and extensive) - lots of edits - but why not just patch and fix the main php code?

Gérald

At least you made your point... a lot of people are talking about your patches!

Marcio Albuquerque

This is VERY NICE! But, why didn't you submit this to PHP patches? I think this is the wrong way and misplaced.

Philip Olson

It's worth mentioning that <?= will always be available as of PHP 5.4, as it's been decoupled from short tags. Also, str_random() is great. I once posted a related RFC ( https://wiki.php.net/rfc/get-random ) but forgot about it. Maybe that'd help someone. And short array syntax is planned but specifics are being worked out (see also the objectarrayliterals rfc), but I'm fairly certain a version will happen in 5.4. And lastly, everyone, fork isn't a bad word... just ask github. :)

Alex Dowgailenko

Interesting changes, but I think the effort is misplaced when it comes to actually improving PHP. For example, we need to get rid of php.ini, and the array type should be an instance of ArrayObject, with all array manipulation functions being methods of ArrayObject:

$newArray = array_merge($array1, $array2);
versus
$array1->merge($array2);

Error handling sucks. Bring in the finally block and make everything an exception and get rid of the '@' symbol:

if (mkdir('/foo/bar')) {
    // carry on
} else {
    // didn't work
}

is a classic example. If /foo/bar already exists, mkdir throws a warning to stdout, which is something I probably don't care about. But to avoid it, I have to check for the existence of the directory first and put a @ in front of mkdir? Strange...

While your work certainly isn't BAD (python-esque array syntax is awesome in fact), I just think there are more pressing issues to resolve first, that's all.

Corey

@Pierre, I believe internals is slightly jaded in their opinions. I was a little disheartened by the team and community discrepancies over short syntax support. Listen to the community; they're the lifeblood of this language. Over the last few years there has been an exodus to languages like Ruby, Python, and Node. Some interesting debates were brought up in the recent "5.4 moving forward" thread.

For anybody who hasn't had a chance to read the discussion, check out this abridged version:

http://www.serverphorums.com/read.php?7,328571

Lastly, here's the PHP 5.4 Todo Wiki for referencing proposed changes:

https://wiki.php.net/todo/php54

Merlijn

Some good ideas....but...

I only read "I wrote"... "I did"... I always prefer "we" over "I".
And I agree with the others... violation of open source licenses is lame. One of the great things about open source projects is that there are discussions. Discussions leading to consensus. Even an "I" project like Python can't exist without a lot of "we"s.

Hope you will bother getting the changes (some of them are ok imho) into the real PHP branch.

Robert

Derick,

thanks for the hint, but I don't want to leave it as a standalone project. I modified PHP as a proof of concept in order to get these changes into one of the next releases. That's also why I published it as PHP5.3.6, because it originated from PHP5.3.6.

Pierre, I published several patches on the bug tracker. The first got read one year after publication, that's why I tried this way.

Derick

Hi,

I'm not sure if you're aware of it, but you actually cannot release a fork of PHP under the name "PHP". From the license (http://www.php.net/license/3_01.txt):

4. Products derived from this software may not be called "PHP", nor
may "PHP" appear in their name, without prior written permission
from group@php.net. You may indicate that your software works in
conjunction with PHP by saying "Foo for PHP" instead of calling
it "PHP Foo" or "phpfoo"

cheers,
Derick

Pierre

Most of these changes are amusing rather than really useful, and hardly enough to justify a fork. Maybe the foreach one could be handy.

But what amazed me a lot is that I can't find a post from you on bugs.php.net or internals. I think you're missing a critical part of what makes OSS.

Clement Herreman

Nice work you've done, and it's encouraging to make PHP evolve in the right way.

However, you wrote: "I would leave a simple singleton-structure for encapsulation, everything else is too much for a web-programming language."

When it comes to writing large and powerful web apps, good OOP is the bare minimum to have some standards of quality and maintainability (among other qualities).

mike

@BG - how is Drupal the wrong way to do things?

Infinite extensibility without having to introduce OOP.

The comment about removing OOP - that's wishful thinking. I've made a great living without having to write a single line of OOP code. I could go into why I don't feel it is necessary... but that's not really in scope here, but along the lines of "too much for a web-programming language." :)

@Robert:

I'm all for some of these basic improvements to PHP core, and not sure why some of these are not introduced. As long as they don't break BC...

As this is your project, you should be able to prefix your functions however you want :)

As far as the short array syntax, is this supposed to emulate JSON exactly? Unless it is syntactically exact to the language, it's still a different syntax to learn, and winds up making someone have to think "oh wait that's PHP's version of this syntax!" ... I think PHP is damn simple and straightforward. Want an array? You say "array" ... boom! :p

Robert

@Fabian: I'll try to bring some of the changes into PHP 5.4.

@hSATAC: It would be possible to make it a standalone patch. It was really easy to add this - just a new line in the bison rule file.

@mike: I had my own function prefix but was outvoted in favor of the standard PHP naming conventions.

I wouldn't remove OOP, even if I don't understand why Java is being imitated more and more. I would leave a simple singleton-structure for encapsulation, everything else is too much for a web-programming language. Not least because a website is only rendered from top to bottom. But this decision is out of my hands.

The short array syntax is very useful. When you look at PHP's history, it has always been PHP's aim to let beginners develop rapidly, no matter what programming background they have.

Anyone who needs a few more examples of the new functions can take a look at my infusion extension. A good bunch of these functions were added to this fork (but were optimized as well): http://www.xarg.org/2009/12/php-enhancement/

@Remo: The PHP fork is used in production, alongside a bunch of other changes to the core. I have no intention of maintaining my own fork for long, even if articles such as http://pooteeweet.org/blog/0/1689 could tempt me to do so.

About ob_fwrite(), take a look at my newly published article: http://www.xarg.org/2011/06/faster-php-behind-fastcgi/

@Carl: I've developed a solution for the templating problem you address. I'm still working on the article about it, stay tuned.

Carl Helmertz

This is the most exciting news about PHP in a long time, good job! But I fear that it won't reach all those who would benefit from it if the implementation stays in a fork, while it seems that most of the stuff would be ***ing awesome to have - it's just too much trouble to keep track of a fork for most of us. Nevertheless - really good thinking, this is the kind of stuff PHP needs :)

I've got an idea to support easier templating, a shorter syntax for escaping output via (for example) , where the colon means "escape and echo".

Cheers//Carl

Remo

There is some great stuff! I like strmap and such things, but I'm not sure we really need functions like ob_fwrite, as it isn't used on a daily basis.

But mostly I like that you've cleaned up a few things like safe mode and short tags. Having fewer options keeps the code clean.

But please keep OOP, it would be a nightmare without it!

I'm not sure if these things are enough to move away from the official PHP release though. As a few others asked, any chance to include a few things in the official PHP repository?

BG

Mike said: "Drupal is a good model of how to extend anything with procedural hooks :p"

Wow... Drupal is probably the worst example of how to do things.

Anyways that is just my opinion (you have the right to yours), I just don't see more people using PHP if you took out OO.

mike

Ah, I just had to add some more :)

+1 strmap, strcut, timechop
+1 for all the "tidied up PHP" stuff
+1 for removing $_REQUEST by default (it shouldn't be allowed at all!)
+1 for UTF-8 and ENT_QUOTES as default
+1 for performance optimizations

-1 you implemented str_ functions alongside str functions. You should at least keep your implementation consistent :)

mike

This is cool! :)

Hopefully you'll clean up the needle vs. haystack ordering and the function naming inconsistencies (str_parse vs. strtoupper, etc.), and maybe give a few others more consistent prefixes.

Should add in named parameters to functions! :)

Personally I'd remove all OOP :) One consistent format for them all: procedural. Drupal is a good model of how to extend anything with procedural hooks :p

I'd also remove namespaces. I don't see the necessity for them. I think it's just further bloating and distorting the language and trying to make it match other languages.

I have to say, -1 on the short array syntax. I don't think it needs it. Why make multiple paths to the same destination?

+1 for the mysqli_real_escape() equivalent. I always wondered why it really needed a db handle. Just make something that is binary/Unicode safe that escapes properly.

hSATAC

I love the array style too!
Is it possible to make it a stand-alone patch or something, so that we could apply it to any version of PHP?

Yassine

I really love the JavaScript-style arrays :D

gphilip

Sounds awesome, all of them could really go to the official repo... try to get them there :)

Fabian

This sounds pretty awesome. Did you try to get those changes into the official PHP repository?
