Installation using PIP
# install python-pip sudo apt-get install -y python-pip # install cql driver sudo pip install cql
Manual Installation from the Sources
# download path mkdir -p ~/build/python-cql cd ~/build/python-cql # download git clone https://github.com/pcmanus/python-cql cd python-cql sudo python setup.py install
Query Example
# create and move to diretory mkdir -p ~/build/python-cql cd ~/build/python-cql # create test file cat > test.py <<-"_EOF_" import cql host, port = "127.0.0.1", 9160 keyspace = "mytest" con = cql.connect(host, port, cql_version='3.0.0') print (con) cursor = con.cursor() query="DROP KEYSPACE %s;"%keyspace print(query) r= cursor.execute(query) print(r) query=""" CREATE KEYSPACE %s WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 1}; """%keyspace print(query) r= cursor.execute(query) print(r) query="USE %s;"%keyspace print(query) r= cursor.execute(query) print(r) query=""" CREATE TABLE users ( nickname text, password text, gender text, email text, active int, country text, date_of_birth timestamp, PRIMARY KEY (nickname) ); """ print(query) r= cursor.execute(query) print(r) query="CREATE INDEX country_index ON users (country);" print(query) r= cursor.execute(query) print(r) query="CREATE INDEX active_index ON users (active);" print(query) r= cursor.execute(query) print(r) query=""" INSERT INTO users(nickname, email, country, active) VALUES ('first', 'first@first123.com', 'UK', 1) ; """ print(query) r= cursor.execute(query) print(r) query=""" BEGIN BATCH INSERT INTO users(nickname, email, country, active) VALUES ('asmith', 'a@aaa.com', 'US', 1) ; INSERT INTO users(nickname, email, country, active) VALUES ('bsmith', 'b@bbb.com', 'US', 1) ; APPLY BATCH; """ print(query) r= cursor.execute(query) print(r) query="SELECT COUNT(*) FROM users;" print(query) r= cursor.execute(query) print(r) for row in cursor: # Iteration is equivalent to lots of fetchone() calls print row query="SELECT * FROM users;" print(query) r= cursor.execute(query) print(r) for row in cursor: # Iteration is equivalent to lots of fetchone() calls print row query="SELECT * FROM users where country > 'A' and active=1 ALLOW FILTERING;" print(query) r= cursor.execute(query) print(r) for row in cursor: # Iteration is equivalent to lots of fetchone() calls print row query=""" CREATE TABLE timeline ( nickname text, message text, created timeuuid, ctime timestamp, PRIMARY KEY (nickname, created) ) WITH CLUSTERING ORDER BY (created DESC); """ print(query) r= cursor.execute(query) print(r) import uuid key = str(uuid.uuid4()) import datetime from datetime import datetime dt=datetime.now().strftime("%Y-%m-%d %H:%M:%S") query=""" BEGIN BATCH INSERT INTO timeline(nickname, message, created, ctime) VALUES ('asmith','this message one', now(), '"""+str(dt)+"""' ) ; INSERT INTO timeline(nickname, message, created, ctime) VALUES ('asmith','this message two', """+str(uuid.uuid1())+""" , '"""+str(dt)+"""' ) ; APPLY BATCH; """ print(query) r= cursor.execute(query) print(r) query="SELECT * FROM timeline;" print(query) r= cursor.execute(query) print(r) for row in cursor: # Iteration is equivalent to lots of fetchone() calls print row print datetime.fromtimestamp(long(row[2])) # FINISH cursor.close() con.close() """ import uuid random_uuid = str(uuid.uuid4()) # anonymous global_uuid = str(uuid.uuid1()) # machine id coded inside cursor.execute("CQL QUERY", dict(kw='Foo', kw2='Bar, etc...)) - cursor.description # None initially, list of N tuples that represent the N columns in a row after an execute. Only contains type and name info, not values. - cursor.rowcount # -1 initially, N after an execute - cursor.arraysize # variable size of a fetchmany call - cursor.fetchone() # returns a single row - cursor.fetchmany() # returns self.arraysize # of rows - cursor.fetchall() # returns all rows, don't do this. Query substitution: - Use named parameters and a dictionary of names and values. e.g. execute("SELECT * FROM CF WHERE name=:name", {"name": "Foo"}) """ _EOF_
python test.py
Tips
uuid1 vs uuid4
uuid1 vs uuid4
uuid1()
Generates a UUID from a host ID, sequence number, and the current time – guaranteed collision free
uuid4()
Generates a random UUID, can collide, chance extermely small
See Stackoverflow Reference: http://stackoverflow.com/questions/1785503/when-should-i-use-uuid-uuid1-vs-uuid-uuid4-in-python
Troubleshooting
1. InvalidRequestException – Invalid version value: 3.0
Fix: Change CQL Version from “3.0” to “3.0.0” in order to fix the following exception
a@a:~/build/python-cql$ python test.py Traceback (most recent call last): File "test.py", line 7, incon = cql.connect(host, port, cql_version='3.0') File "/usr/local/lib/python2.7/dist-packages/cql/connection.py", line 143, in connect consistency_level=consistency_level, transport=transport) File "/usr/local/lib/python2.7/dist-packages/cql/connection.py", line 59, in __init__ self.establish_connection() File "/usr/local/lib/python2.7/dist-packages/cql/thrifteries.py", line 162, in establish_connection self.set_cql_version(self.cql_version) File "/usr/local/lib/python2.7/dist-packages/cql/thrifteries.py", line 165, in set_cql_version self.client.set_cql_version(cql_version) File "/usr/local/lib/python2.7/dist-packages/cql/cassandra/Cassandra.py", line 1983, in set_cql_version self.recv_set_cql_version() File "/usr/local/lib/python2.7/dist-packages/cql/cassandra/Cassandra.py", line 2004, in recv_set_cql_version raise result.ire cql.cassandra.ttypes.InvalidRequestException: InvalidRequestException(why='Invalid version value: 3.0 (see http://semver.org/ for details)')
Source: could-not-connect-cql-30
2. Bad Request: line 4:29 mismatched input ‘:’
Fix: Change
WITH strategy_class = 'SimpleStrategy' AND strategy_options:replication_factor='1';
TO
WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 3}
in order to fix the following exception
a@a:~/build/python-cql$ python test.py ThriftConnection(host='127.0.0.1', port=9160, keyspace=None, conn open) CREATE KEYSPACE mytest WITH strategy_class = 'SimpleStrategy' AND strategy_options:replication_factor='1'; Traceback (most recent call last): File "test.py", line 18, inr= cursor.execute(query) File "/usr/local/lib/python2.7/dist-packages/cql/cursor.py", line 80, in execute response = self.get_response(prepared_q, cl) File "/usr/local/lib/python2.7/dist-packages/cql/thrifteries.py", line 77, in get_response return self.handle_cql_execution_errors(doquery, compressed_q, compress, cl) File "/usr/local/lib/python2.7/dist-packages/cql/thrifteries.py", line 98, in handle_cql_execution_errors raise cql.ProgrammingError("Bad Request: %s" % ire.why) cql.apivalues.ProgrammingError: Bad Request: line 4:29 mismatched input ':' expecting '='
3. error: cannot assign result of function now
Fix: Upgrade Cassandra from Version 1.2.3 to 1.2.4 to fix error
INSERT INTO timeline(nickname, message, created, ctime) VALUES ('asmith','this message two', now(), 7ed26c4e-a452-11e2-92b6-0022fa7a3a68 ) ; Traceback (most recent call last): File "test.py", line 161, inr= cursor.execute(query) File "/usr/local/lib/python2.7/dist-packages/cql/cursor.py", line 80, in execute response = self.get_response(prepared_q, cl) File "/usr/local/lib/python2.7/dist-packages/cql/thrifteries.py", line 77, in get_response return self.handle_cql_execution_errors(doquery, compressed_q, compress, cl) File "/usr/local/lib/python2.7/dist-packages/cql/thrifteries.py", line 98, in handle_cql_execution_errors raise cql.ProgrammingError("Bad Request: %s" % ire.why) cql.apivalues.ProgrammingError: Bad Request: Type error: cannot assign result of function now (type timeuuid) to created (type 'org.apache.cassandra.db.marshal.ReversedType(org.apache.cassandra.db.marshal.TimeUUIDType)')
Source: https://issues.apache.org/jira/browse/CASSANDRA-5386
Check Cassandra Version using any of the following commands:
nodetool -h localhost version #ReleaseVersion: 1.2.4
Manually Updgrade Cassandra to 1.2.3
Cassandra version 1.2.4 is currently not yet available in the ubuntu/datax repositories.
A partial upgrade (not safe for production)
#create directories mkdir -p ~/build/cassandra cd ~/build/cassandra # download version="1.2.4" wget http://apache.openmirror.de/cassandra/$version/apache-cassandra-$version-bin.tar.gz tar -xvf apache-cassandra-$version-bin.tar.gz cd apache-cassandra-$version # move old jar file to old directory sudo mkdir -p /usr/share/cassandra/old/ sudo mv /usr/share/cassandra/lib /usr/share/cassandra/old/ sudo mv /usr/share/cassandra/*1.2.3.jar /usr/share/cassandra/old/ # copy new libraries sudo cp -r lib /usr/share/cassandra/ sudo mv /usr/share/cassandra/lib/apache-cassandra-* /usr/share/cassandra/ # change apache-cassandra link sudo rm /usr/share/cassandra/apache-cassandra.jar sudo ln -s /usr/share/cassandra/apache-cassandra-1.2.4.jar /usr/share/cassandra/apache-cassandra.jar # cassandra sudo /etc/init.d/cassandra restart
Cannot execute this query as it might involve data filtering
Fix: Change query from
SELECT * FROM users where country > 'A' and active=1;
TO
SELECT * FROM users where country > 'A' and active=1 ALLOW FILTERING;
Source: http://www.datastax.com/docs/1.2/cql_cli/cql/SELECT
Error:
SELECT * FROM users where country > 'A' and active=1; Traceback (most recent call last): File "test.py", line 111, inr= cursor.execute(query) File "/usr/local/lib/python2.7/dist-packages/cql/cursor.py", line 80, in execute response = self.get_response(prepared_q, cl) File "/usr/local/lib/python2.7/dist-packages/cql/thrifteries.py", line 77, in get_response return self.handle_cql_execution_errors(doquery, compressed_q, compress, cl) File "/usr/local/lib/python2.7/dist-packages/cql/thrifteries.py", line 98, in handle_cql_execution_errors raise cql.ProgrammingError("Bad Request: %s" % ire.why) cql.apivalues.ProgrammingError: Bad Request: Cannot execute this query as it might involve data filtering and thus may have unpredictable performance. If you want to execute this query despite the performance unpredictability, use ALLOW FILTERING
References
1. Apache Cassandra Home: http://cassandra.apache.org/
2. python-cql Github Home: https://github.com/pcmanus/python-cql
3. python-cql PIP Home: https://pypi.python.org/pypi/cql
4. CQL Tutorial: http://techdocs.acunu.com/v1.1/html/tutorial.html#using-cql
5. Apache Cassandra 1.2 Documentation(PDF): http://www.datastax.com/doc-source/pdf/cassandra12.pdf
6. Cassandra Query Language Manual: http://cassandra.apache.org/doc/cql3/CQL.html
7. Cassandra, CQL3, and Time Series Data With Timeuuid: http://www.rustyrazorblade.com/2012/10/cassandra-cql3-and-time-series-data-with-timeuuid/
8. CQL3 for Cassandra experts:
9. A thrift to CQL3 upgrade guide: http://www.datastax.com/dev/blog/thrift-to-cql3
10. Understanding the Cassandra data model: http://www.datastax.com/docs/1.2/ddl/index
11. Cassandra Data Modeling Best Practices, Part 1: http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeling-best-practices-part-1/
12. PRIMARY KEY’s in CQL: http://planetcassandra.org/blog/post/primary-keys-in-cql
13. CQL3 In Depth: http://www.slideshare.net/yukim/cql3-in-depth
14. CQL and Timeseries Schema: http://stackoverflow.com/questions/13500216/cql-and-timeseries-schema
15. Advanced Time Series with Cassandra: http://www.datastax.com/dev/blog/advanced-time-series-with-cassandra
16. Secondary Index ought to have low cardinality: http://en.wikipedia.org/wiki/Cardinality_(SQL_statements)