Overview

Introduction

Cassandra Universal Driver is an Open Source solution aimed at solving one problem with this popular database: the lack of reliable drivers.

DataStax has published four drivers:

  • DataStax C# Client Driver
  • DataStax Java Client Driver
  • DataStax C++ Client Driver
  • DataStax Python Driver

So there are those 4 drivers, plus a lot of community-created drivers that don’t work, were abandoned a long time ago, or don’t support the latest CQL (Cassandra Query Language).

Motivation

I wanted to work with Cassandra and PHP, and I found nothing that worked for my needs, so first I created CQLSÍ for PHP and added it to my PHP Framework. CQLSÍ (CQL Simple Interface) is a wrapper that invokes the original cqlsh tool from DataStax, sending the queries in a file. It captures and parses the output and the errors, and offers all the info in an abstraction-layer array that is easy to work with.
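The wrapper idea described above can be illustrated with a minimal sketch in Python (CQLSÍ itself is PHP, and this is not its actual code). The function names run_cql and parse_cqlsh_table are hypothetical, the sketch assumes cqlsh is on the PATH and prints its usual pipe-delimited table, and the parser is naive: a value containing "|" would break it.

```python
import os
import subprocess
import tempfile


def parse_cqlsh_table(output):
    """Turn cqlsh's pipe-delimited table output into a list of dicts.

    Every value is kept as a string, as in the approach described above.
    """
    header = None
    rows = []
    for line in output.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        # Skip the "-----+-----" separator and the "(N rows)" footer.
        if set(stripped) <= set("-+") or stripped.startswith("("):
            continue
        cells = [cell.strip() for cell in line.split("|")]
        if header is None:
            header = cells
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def run_cql(query, cqlsh_path="cqlsh"):
    """Write the query to a temporary file, run `cqlsh -f` on it, parse the result."""
    with tempfile.NamedTemporaryFile("w", suffix=".cql", delete=False) as handle:
        handle.write(query)
        path = handle.name
    try:
        result = subprocess.run(
            [cqlsh_path, "-f", path], capture_output=True, text=True
        )
    finally:
        os.remove(path)
    return {"rows": parse_cqlsh_table(result.stdout), "error": result.stderr.strip()}
```

Each call starts a new cqlsh process, which is where the CPU cost of this approach comes from.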

It was cool, and enough for my needs, and it made set, map and list very easy to use, as all the data is provided as String. Time saving!

It also supports UTF-8 perfectly.

I was so happy when a student from Kennesaw university, David Lebrón, contacted me and explained that he and other students were using it at the university, and creating their programs with PHP instead of Java!

CQLSÍ does not have much performance, as it relies on shell calls, and that is expensive in CPU, but it works very well and allows batch INSERTs at once, and I use it in some of my projects. I use it especially for crons that fetch data and run batch queries.

History

I was creating another solution, much much better, able to keep the connections open (reusing the connection to the Cluster).

I contacted Scott from DataStax, explained that the lack of PHP support was so sad, and offered that they hire me to solve all this.

They were so nice, and after some delay I was interviewed over Skype by Michaël.

I’m not a Marketing guy and I think I don’t explain my strengths very well. I’m a doer. I do cool things, and later I forget them. I’m not worried about selling myself; I know that if someone gives me a problem I’ll solve it, or say no from the beginning. If I had not written some of my coolest creations in my CV, I would probably have forgotten many. I had the same sensation when I was interviewed by Facebook and Amazon for Manager roles.

But I explained in a long email my plans for building a C extension for the PHP driver, covering specifically the webserver world and integrating it with the PHP Community, ensuring adoption by popular Frameworks like Zend2, Symfony2, Laravel… After a week of deliberating he decided to go for a pure C++ guy instead (I’m multilanguage), as he wanted a super-performance driver for all the scenarios, with async support, etc.

I’m more for spiral development (phase by phase) and 80/20: something that does not compromise the time to market.

I wanted to deliver very quickly a solution that solves 90% of the cases, which are web. PHP is used mainly for web.

Many of my friends want to use Cassandra with PHP and have no option other than using other databases.

I decided to continue contributing to the project, from the Community.

There was still a need for the other languages, and for PHP projects that need more performance.

Conclusion

I created for myself a solution with a lot of performance that keeps the connections in memory and reuses them, and is very very efficient. It needs some more work before I can finish it publicly and make it super, so in the meantime I decided to release a simple Open Source solution that allows any language “speaking” TCP/IP to interface with Cassandra.

So I created the “Cassandra Universal Driver”, which is a gateway, or interface, based on TCP/IP (a webservice) that relies on the DataStax Python driver to offer the latest CQL support to any language.

It is written in Python, it is Open Source, and it provides sample code for many languages, like PHP, node.js, Perl, Bash with curl or wget…

It is many times faster than CQLSÍ. It is not as fast as using native calls, mainly because the connections are not reused (a query in a complete cycle takes 0.1 seconds on my computer), but it is very cool and allows any programming language to use Cassandra via TCP.

It also allows cool architectures, as you can split the Cassandra Universal Driver Gateways across other servers, and use Load Balancers as well. Everything is HTTP, so it is very clean.
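Because the gateway is plain HTTP, a client in any language only has to build a URL and read the response. Here is a rough sketch of what that could look like in Python. The base URL, the port and the "cql" parameter name are assumptions made up for illustration, not the gateway's documented interface, so check the sample code shipped with the project for the real ones.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_gateway_url(base_url, cql):
    """Build the request URL; the "cql" parameter name is hypothetical."""
    return base_url.rstrip("/") + "/?" + urlencode({"cql": cql})


def query_gateway(base_url, cql, timeout=5.0):
    """Send one query to the gateway and return the raw response body as text."""
    with urlopen(build_gateway_url(base_url, cql), timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Hypothetical address; point it at wherever your gateway (or load balancer) listens.
    print(query_gateway("http://127.0.0.1:8000", "SELECT * FROM system.local"))
```

Since the client only needs a base URL, putting a Load Balancer in front of several gateways requires no change on the client side.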

I hope you enjoy it. :)
