I've just started learning MySQL's internals but I've got an idea which I want to convert to a question here, to ask people who are already deeper in it. Is it a bad idea to completeley avoid query parsing on the server side and use a binary protocol instead? This way the client parses the query and could cache the statement structure for further usage or another client API uses a NoSQL approach to send the request data to the server.
Yoshi has proven, that query parsing consumes a lot of time and wrote HandlerSocket to speed up primary key lookups. I love the idea to outperform memcache with MySQL, but an additional API is needed and in general, it looks like a temporary hack.
Constantly sent queries to a production server are mostly the same or at least have the same structure, with the difference that only different values are passed as parameter. But how many different queries can be counted on a relatively complex project? 50? 100? 200? In any case smaller compared to the fact that each query will be parsed for every request - except the query got a Query Cache hit. But I think, most projects are better off without the Query Cache, because the hit rate on a dynamic dataset does not justify it's use. Anyway, that's another topic. The problem of the normal protocol is, that everything is transfered as text. Integers and floats must also be converted to a string and must be unpacked on the server side.
Using a binary protocol would save bandwidth and significantly improves the performance. mysqlnd already got a very simple binary protocol implementation for the response, where it's possible to write data types without converting it to string to send it back to the client. The same thing would be a nice to have for requests.
Am I wrong with this idea? I think the biggest effort will implementing a new robust protocol for MySQL, removing the parser from the server, writing a new and better query cache (not based on the initial query string but the structure of the query and it's data) and a new version of the libmysqlclient library. mysqlnd as a native library inside of PHP would also need to be changed, but would the work not be worth while?
On the other side query parsing can be avoided by caching it's structure on the client, data will be more accurate (especially floats) and the big win of saving bandwidth, as I already mentioned.
You might also be interested in the following
Sorry, comments are closed for this article. Contact me if you have an inventive contribution.