Apache Hive development has shifted from the original Hive server (HiveServer1) to the new server (HiveServer2), so users and developers need to move to the new access tool. However, there's more to this process than simply switching the executable name from hive to beeline.
Apache Hive was originally a heavyweight command-line tool that accepted commands and ran them using MapReduce. Later, the tool was split into a client-server model, in which HiveServer1 is the server (responsible for compiling and monitoring MapReduce jobs) and Hive CLI is the command-line interface (which sends SQL to the server). More recently, the Hive community introduced HiveServer2, an enhanced Hive server designed for multi-client concurrency and improved authentication that also provides better support for clients connecting through JDBC and ODBC. Now HiveServer2, with Beeline as the command-line interface, is the recommended solution; HiveServer1 and Hive CLI are deprecated, and the latter won't even work with HiveServer2.
The primary difference between the Hive CLI and Beeline is how the clients connect to Apache Hive.
- The Hive CLI connects directly to HDFS and the Hive Metastore, and can be used only on a host with access to those services.
- Beeline connects to HiveServer2 and requires access to only one .jar file.
To connect the Hive CLI to a remote server, you specify the server's host name and port:
hive -h hostname -p port
In contrast, Beeline connects to a remote HiveServer2 instance using JDBC. Thus, the connection parameter is a JDBC URL of the kind that's common in JDBC-based clients:
beeline -u url -n username -p password
Here are a few URL examples:
jdbc:hive2://ubuntu:11000/db2?hive.cli.conf.printheader=true;hive.exec.mode.local.auto.inputbytes.max=9999#stab=salesTable;icol=customerID
jdbc:hive2://?hive.cli.conf.printheader=true;hive.exec.mode.local.auto.inputbytes.max=9999#stab=salesTable;icol=customerID
jdbc:hive2://ubuntu:11000/db2;user=foo;password=bar
jdbc:hive2://server:10001/db;user=foo;password=bar?hive.server2.transport.mode=http;hive.server2.thrift.http.path=hs2
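Judging from the examples above, these URLs share one shape: the host, port, and database come first, session variables follow after ';', Hive configuration settings follow after '?', and Hive variables follow after '#'. As a rough sketch of that layout, the shell function below assembles a URL from those parts. The name hive2_url is hypothetical and not part of Hive or Beeline; the hosts and values are the ones from the examples above.

```shell
#!/bin/sh
# hive2_url is a hypothetical helper for illustration only.
# Usage: hive2_url HOST:PORT DB [SESSION_VARS] [HIVE_CONF] [HIVE_VARS]
# The body runs in a subshell so the url variable does not leak out.
hive2_url() (
    url="jdbc:hive2://$1/$2"
    # Session variables (for example user=...;password=...) follow ';'.
    if [ -n "${3:-}" ]; then url="$url;$3"; fi
    # Hive configuration settings follow '?'.
    if [ -n "${4:-}" ]; then url="$url?$4"; fi
    # Hive variables follow '#'.
    if [ -n "${5:-}" ]; then url="$url#$5"; fi
    printf '%s\n' "$url"
)

# Rebuild two of the example URLs from their parts.
hive2_url ubuntu:11000 db2 'user=foo;password=bar'
hive2_url server:10001 db 'user=foo;password=bar' \
    'hive.server2.transport.mode=http;hive.server2.thrift.http.path=hs2'
```

Because ';', '?', and '#' are all special to the shell, the resulting URL should be quoted when it is passed to beeline -u.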
Apache Hive CLI VS Beeline: Query Execution
Executing queries in Beeline is very similar to executing them in Hive CLI. In Hive CLI:
hive -e <query in quotes>
hive -f <query file name>
In Beeline:
beeline -e <query in quotes>
beeline -f <query file name>
In either case, if neither -e nor -f is given, the client enters an interactive mode in which you can type and execute queries or commands line by line.
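Since -e and -f carry over unchanged, migrating a script that runs a query file mostly comes down to adding the JDBC URL. The wrapper below is a minimal sketch of that idea. The names run_query_file, HIVE2_URL, and DRY_RUN are hypothetical, and the default URL (localhost, port 10000, database default) is an assumption to replace with your own server. By default it only prints the command it would run.

```shell
#!/bin/sh
# Assumed connection URL; override by exporting HIVE2_URL before running.
HIVE2_URL="${HIVE2_URL:-jdbc:hive2://localhost:10000/default}"

# run_query_file is a hypothetical wrapper, not a Hive or Beeline command.
run_query_file() {
    # Old form: hive -f "$1"
    # New form: the same -f option, plus the JDBC URL passed with -u.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Dry run: show the command instead of contacting a server.
        printf 'beeline -u %s -f %s\n' "$HIVE2_URL" "$1"
    else
        beeline -u "$HIVE2_URL" -f "$1"
    fi
}

# Write a one-line query file and show the command that would run it.
query_file=$(mktemp)
printf 'SHOW TABLES;\n' > "$query_file"
run_query_file "$query_file"
```

Setting DRY_RUN=0 makes the wrapper actually invoke beeline, which requires Beeline on the PATH and a reachable HiveServer2.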
Apache Hive CLI VS Beeline: Variables
There are four namespaces for variables:
- hiveconf for Hive configuration variables
- system for system variables
- env for environment variables
- hivevar for Hive variables (HIVE-1096)
There are two ways to define a variable: as a command-line argument or using the set command in interactive mode.
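As a small illustration of the set form, the sketch below writes a HiveQL script that defines a variable in the hivevar namespace and then references it. The variable name tbl and the table name sales are made up for the example. The same two statements could be typed line by line in interactive mode.

```shell
#!/bin/sh
# Write a small HiveQL script to a temporary file.
# The heredoc delimiter is quoted ('EOF') so the shell leaves ${...} alone
# and the reference reaches Hive unchanged.
script=$(mktemp)
cat > "$script" <<'EOF'
-- Define a variable in the hivevar namespace, then reference it.
set hivevar:tbl=sales;
SELECT * FROM ${hivevar:tbl} LIMIT 10;
EOF

# Show the script that would be passed to the client with -f.
cat "$script"
```

The quoting matters: inside double quotes or an unquoted heredoc, the shell would try to expand ${hivevar:tbl} itself before Hive ever sees it.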
Defining Hive variables in the command line in Hive CLI:
hive -d key=value
hive --define key=value
hive --hivevar key=value
Defining Hive variables in the command line in Beeline:
beeline --hivevar key=value
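Going by the options listed above, a script that used -d or --define with the Hive CLI needs those flags rewritten to --hivevar for Beeline. The function below is a rough sketch of that rewrite. The name to_beeline_args is hypothetical, it assumes every -d or --define is followed by a key=value argument, and because it joins arguments with spaces it is only suitable for values without whitespace.

```shell
#!/bin/sh
# to_beeline_args is a hypothetical helper for illustration only.
# It rewrites Hive CLI variable flags (-d, --define) to --hivevar and
# passes every other argument through unchanged.
# The body runs in a subshell so the out variable does not leak out.
to_beeline_args() (
    out=""
    while [ $# -gt 0 ]; do
        case "$1" in
            -d|--define)
                # Assumes a key=value argument follows the flag.
                out="$out --hivevar $2"
                shift 2
                ;;
            *)
                out="$out $1"
                shift
                ;;
        esac
    done
    # Drop the leading space added by the first append.
    printf '%s\n' "${out# }"
)

to_beeline_args -d year=2015 --hivevar tbl=sales
# prints: --hivevar year=2015 --hivevar tbl=sales
```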
Beeline Operating Modes and HiveServer2 Transport Modes