Standalone Hive
Support
Support with hive 2.3.9.
Configuration
Vendor
HIVE_APACHE
Connection format
The connection URL format for direct connection to hive server is jdbc:hive2://<databse_host>:<database_port>/<database>;principal=<principal>;ssl=true.
The connection URL format via HTTP is jdbc:hive2://<databse_host>:<database_port>/<database>;principal=<principal>;ssl=true;transportMode=http;httpPath=cliservice.
Permissions
User or role with SELECT permissions on the tables or views for which data sampling or metadata extraction is desired.
Hive in Cloudera environment
Support
Support in cdh environments from 6.0 to 6.3 or cdp 7.0 and 7.1 with hive 1.0 up to 3.1.
Configuration
Vendor
HIVE_CLOUDERA
Connection format
The connection URL format for direct connection to hive server is jdbc:hive2://<databse_host>:<database_port>/<database>;principal=<principal>;ssl=true.
The connection URL format via HTTP is jdbc:hive2://<databse_host>:<database_port>/<database>;principal=<principal>;ssl=true;transportMode=http;httpPath=cliservice.
The connection URL format via zookeeper is jdbc:hive2://<databse_host>:<database_port>/<database>;principal=<principal>;ssl=true;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2.
Permissions
User or role with SELECT permissions on the tables or views for which data sampling or metadata extraction is desired.
Specific adjustments for Hive
To execute JDBC queries against Hive, a series of adjustments must be made to the JVM
-
Use of Kerberos authorization
-
SSL
JVM SSL adjustments
Cloudera needs to secure connections with SSL to be able to use Kerberos. The default solution it uses is to deploy its own self-signed CA in a JKS container using properties analogous to these:
-Djavax.net.ssl.keyStore=/....../....../cm-auto-host_keystore.jks
-Djavax.net.ssl.keyStoreType=jks -Djavax.net.ssl.trustStore=/....../...../cm-auto-global_truststore.jks
-Djavax.net.ssl.trustStoreType=jks
There are two options
-
That the paths are provided to the integrator by the Client
-
Adding the Cloudera self-signed CA to the JVM's default certificate container
Provided paths
The plugin's JVM must be executed with the following parameters:
|
Parameter |
Value |
|
-Djavax.net.ssl.keyStore |
Path where the private key store is located. |
|
-Djavax.net.ssl.keyStoreType |
jks |
|
-Djavax.net.ssl.trustStore |
Path where the certificate store is located. |
|
-Djavax.net.ssl.trustStoreType |
jks |
Adding the self-signed CA to the JVM certificate container
The goal is to add the Cloudera self-signed certificate to Java's cacerts
keytool -importkeystore -srckeystore /var/lib/cloudera-scm-agent/agent-cert/cm-auto-global_truststore.jks -destkeystore /etc/ssl/certs/java/cacerts -srcstorepass -deststorepass changeit -v
This approach has the advantage of not requiring additional parameters to be added to the JVM.
Kerberos adjustments for the JVM
It is necessary to provide the JVM with the mechanism for the automatic use of Kerberos tickets.
To do this we must launch the JVM with the following parameters.
-
-Djava.security.auth.login.config=security.config
-
-Djavax.security.auth.useSubjectCredsOnly=false
With the security.config file having the following structure:
com.sun.security.jgss.initiate {
com.sun.security.auth.module.Krb5LoginModule required
useTicketCache=true
StoreKey=true
useKeyTab=true
keyTab="/home/ubuntu/usr_anjana.keytab"
principal="usr_anjana@CDP.LOCAL";
};
The keytab and principal parameters must be supplied to us
Common errors
If the Hive tables are encrypted it returns this error log
No Route to Host from ip-10-202-21-147.eu-west-1.compute.internal/10.202.21.147 to ip-10-202-21-122.eu-west-1.compute.internal:8020 failed on socket timeout exception: java.net.NoRouteToHostException