  Hive
  HIVE-4160

Vectorized Query Execution in Hive


The Hive query execution engine currently processes one row at a time: a single row of data goes through all the operators before the next row can be processed. This mode of processing is very inefficient in terms of CPU usage, and research has demonstrated that it yields very low instructions per cycle [MonetDB X100]. Hive also currently relies heavily on lazy deserialization, and data columns go through a layer of object inspectors that identify the column type, deserialize the data, and determine the appropriate expression routines in the inner loop. These layers of virtual method calls slow down processing further.

This work will add support for vectorized query execution to Hive, where, instead of individual rows, batches of about a thousand rows at a time are processed. Each column in the batch is represented as a vector of a primitive data type. The inner loop of execution scans these vectors very fast, avoiding method calls, deserialization, unnecessary if-then-else, etc. This substantially reduces CPU time used, and gives excellent instructions per cycle (i.e. improved processor pipeline utilization). See the attached design specification for more details.
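As a rough illustration of the batch-at-a-time idea described above, the following Java sketch adds two columns of a batch in one tight loop over primitive arrays. The names (`VectorizedAddSketch`, `Batch`, `addColumns`) and the batch capacity of 1024 are assumptions made for this example; they are not the classes from the design specification.

```java
import java.util.Arrays;

final class VectorizedAddSketch {

    // Hypothetical stand-in for a row batch: each column is a primitive array.
    static final class Batch {
        final long[] colA;
        final long[] colB;
        final long[] result;
        int size; // number of valid rows currently in the batch

        Batch(int capacity) {
            colA = new long[capacity];
            colB = new long[capacity];
            result = new long[capacity];
        }
    }

    // Inner loop: no per-row method calls, no deserialization, no type dispatch.
    static void addColumns(Batch batch) {
        long[] a = batch.colA;
        long[] b = batch.colB;
        long[] out = batch.result;
        int n = batch.size;
        for (int i = 0; i < n; i++) {
            out[i] = a[i] + b[i];
        }
    }

    public static void main(String[] args) {
        Batch batch = new Batch(1024); // assumed capacity, "about a thousand rows"
        batch.size = 3;
        for (int i = 0; i < batch.size; i++) {
            batch.colA[i] = i;
            batch.colB[i] = 10L * i;
        }
        addColumns(batch);
        System.out.println(Arrays.toString(Arrays.copyOf(batch.result, batch.size)));
    }
}
```

The type of each column is decided once, when the expression is chosen for the batch, rather than once per row inside the loop.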


    Issue Links


HIVE-5584 Write initial user documentation for vectorized query on Hive Wiki (Resolved)

relates to
HIVE-10179 Optimization for SIMD instructions in Hive (Open)


Jitendra Nath Pandey added a comment - 13/Mar/13 19:57

This will be incremental work in multiple phases, with no regression on the current system. We will publish a design/scope document very soon.
The main idea behind the proposal is to transform the execution engine to process a row batch at a time instead of a single row. The row batch will consist of column vectors and each operator will process the whole column vector at a time. The column vector will consist of array(s) of primitive types as far as possible.
The expressions will be implemented for various data types using pre-compiled templates. The appropriate expressions will be added to the operators based on data types.
A vectorized iterator interface will be implemented by the file formats to provide vectorized input to the operator tree.
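The vectorized iterator idea can be sketched as follows, assuming a reader that refills one caller-owned batch on every call to next(). The names (`BatchReaderSketch`, `Batch`, `Reader`, `ArrayReader`, `sumAll`) are hypothetical and chosen only for this example; the interface in the design document may differ.

```java
final class BatchReaderSketch {

    // Hypothetical single-column batch, reused across calls to next().
    static final class Batch {
        final long[] col0;
        int size;

        Batch(int capacity) {
            col0 = new long[capacity];
        }
    }

    // Assumed shape of a vectorized iterator: fill the batch, report whether rows were produced.
    interface Reader {
        boolean next(Batch batch);
    }

    // Toy "file format" backed by an in-memory array.
    static final class ArrayReader implements Reader {
        private final long[] source;
        private int pos;

        ArrayReader(long[] source) {
            this.source = source;
        }

        @Override
        public boolean next(Batch batch) {
            int n = Math.min(batch.col0.length, source.length - pos);
            if (n <= 0) {
                batch.size = 0;
                return false;
            }
            System.arraycopy(source, pos, batch.col0, 0, n);
            batch.size = n;
            pos += n;
            return true;
        }
    }

    // Consumer side: one batch object is allocated once and refilled by the reader.
    static long sumAll(Reader reader, int batchCapacity) {
        Batch batch = new Batch(batchCapacity);
        long total = 0;
        while (reader.next(batch)) {
            for (int i = 0; i < batch.size; i++) {
                total += batch.col0[i];
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumAll(new ArrayReader(new long[] {1, 2, 3, 4, 5}), 2));
    }
}
```

Reusing a single batch object avoids allocating per row or per batch in the scan loop, which is one plausible reason for passing the batch into next() rather than returning a new one.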

Jitendra Nath Pandey added a comment - 13/Mar/13 19:58
Eric Hanson added a comment - 14/Mar/13 00:41

This is part of the Stinger initiative. http://hortonworks.com/blog/100x-faster-hive/

Jitendra Nath Pandey added a comment - 18/Mar/13 23:40

The attached document covers the outline of the design. Any comments/feedback are welcome. We will keep updating the document with more details as we include more data types, operators and expressions. We will also include the vectorized iterator design into the document.

Eric Hanson added a comment - 06/Apr/13 01:00

Added section on requirements for implementation of vectorized iterator, with respect to how to load VectorizedRowBatch object on each call to next().

Steve Loughran added a comment - 10/Apr/13 12:23

We couldn't have a copy of the doc in PDF stuck up at the same time as the editable one, could we?

Eric Hanson added a comment - 10/Apr/13 18:22

Fixed a bug in example, plus made minor wording changes in introduction.

Eric Hanson added a comment - 10/Apr/13 18:29

Adding pdf of design doc per request.

Eric Hanson added a comment - 10/Apr/13 18:29

updated version # and date

Eric Hanson added a comment - 09/May/13 22:31

Updated design document with discussion of precise handling and interpretation of all-non-null (noNulls) and all identical (isRepeating) column vectors.
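To make the two flags concrete, here is a minimal sketch of how an expression might branch on them. It assumes that when isRepeating is set only slot 0 is meaningful, and that when noNulls is set the isNull array does not need to be read. The classes `NullFlagsSketch` and `LongVec` are stand-ins written for this example, and the precise rules are the ones in the design document, not this code.

```java
final class NullFlagsSketch {

    // Hypothetical column vector with the two flags discussed above.
    static final class LongVec {
        final long[] vector;
        final boolean[] isNull;
        boolean noNulls = true;      // true: no entry is null, isNull[] need not be consulted
        boolean isRepeating = false; // true: slot 0 stands for every row in the batch

        LongVec(int capacity) {
            vector = new long[capacity];
            isNull = new boolean[capacity];
        }
    }

    // Adds k to every non-null value among the first n rows, in place.
    static void addScalar(LongVec col, long k, int n) {
        if (col.isRepeating) {
            // One logical value for the whole batch: do the work once.
            if (col.noNulls || !col.isNull[0]) {
                col.vector[0] += k;
            }
            return;
        }
        if (col.noNulls) {
            // Fast path: no null checks inside the loop.
            for (int i = 0; i < n; i++) {
                col.vector[i] += k;
            }
        } else {
            for (int i = 0; i < n; i++) {
                if (!col.isNull[i]) {
                    col.vector[i] += k;
                }
            }
        }
    }

    public static void main(String[] args) {
        LongVec repeating = new LongVec(1024);
        repeating.isRepeating = true;
        repeating.vector[0] = 5;
        addScalar(repeating, 2, 1024);
        System.out.println(repeating.vector[0]);
    }
}
```

The repeating case turns n additions into one, and the noNulls case removes a branch from the hot loop; both are the kind of shortcut the flags are meant to enable.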

Also included discussion of TIMESTAMP internal vector representation as a long integer number of nanoseconds since the epoch.
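A small sketch of a nanoseconds-since-epoch encoding in a single long, using java.time.Instant. The class `TimestampNanosSketch` and its method names are invented for this example. The handling of values before 1970 (negative longs, with floor division on the way back) is an assumption of the sketch, not a statement of what the design document specifies.

```java
import java.time.Instant;

final class TimestampNanosSketch {

    static final long NANOS_PER_SECOND = 1_000_000_000L;

    // Encodes an instant as nanoseconds since 1970-01-01T00:00:00Z.
    // Throws ArithmeticException if the value does not fit in a long.
    static long toEpochNanos(Instant t) {
        return Math.addExact(
                Math.multiplyExact(t.getEpochSecond(), NANOS_PER_SECOND),
                t.getNano());
    }

    // Decodes with floor semantics so that negative (pre-epoch) values
    // yield a nanosecond fraction in the range [0, 1e9).
    static Instant fromEpochNanos(long nanos) {
        long seconds = Math.floorDiv(nanos, NANOS_PER_SECOND);
        long fraction = Math.floorMod(nanos, NANOS_PER_SECOND);
        return Instant.ofEpochSecond(seconds, fraction);
    }

    public static void main(String[] args) {
        Instant beforeEpoch = Instant.parse("1969-12-31T23:59:59.500Z");
        long encoded = toEpochNanos(beforeEpoch);
        System.out.println(encoded + " -> " + fromEpochNanos(encoded));
    }
}
```

A signed 64-bit count of nanoseconds covers roughly 292 years on either side of 1970, so this representation trades range for the ability to store timestamps in a plain long vector.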

Eric Hanson added a comment - 10/May/13 00:28

The code for this work is currently in the "vectorization" branch of the public Hive repo.

Eric Hanson added a comment - 13/May/13 18:59

Added discussion of timestamp values before the epoch (in 1970) related to HIVE-4525.

Eric Hanson added a comment - 28/May/13 22:52

Updated design spec with new section by Remus Rusanu about vectorized group-by/aggregate. I edited it a little bit and added the final paragraph on future considerations.

Dmitriy V. Ryaboy added a comment - 03/Jul/13 23:15

Hi folks,
What an incredible amount of work! Looks fantastic, looking forward to this.

It seems like the general idea of a vectorized operator is not Hive-specific. Is there any possibility of abstracting the core logic of an operator that can efficiently process a stream of data, such as what you get from ORCFile, and return the computed results?

Having such a library be available independently of Hive would allow reuse in other Hadoop ecosystem projects (Pig, Cascading, Drill, etc) without the need to reinvent the wheel, and would also bring the whole community behind optimizing one set of operators instead of continuing the existing fragmented state of the world.

The process of separating out such a library might also yield benefits in terms of winding up with a cleaner design and better abstractions (that's been my experience when going through similar exercises on other projects – I don't have any reason to think your current design is not clean or doesn't have good abstractions).

Do you have any thoughts on how this could be achieved? Does this sound like something you would be interested in? Is there something that people currently working on other projects can do to help this become a reality?

Vinod Kumar Vavilapalli added a comment - 04/Jul/13 07:28

A huge +1 to that. Having a common set of operators will be a huge win. That said, I already see that the current branch follows Hive's operator base classes, uses HiveConf, etc. I believe that with a little effort, this can be cleaned up and pulled apart into one separate Maven module that everyone can use.

Some points to think about:

  • The target location of the module. The dependency graph can become unwieldy.
  • Given the use of the base Operator, OperatorDesc, etc. from Hive, if there is interest and commitment, we should do this ASAP, while we only have a handful of operators.
  • Have one other project demonstrate how it can be reused across ecosystem projects. Pig would be great, and just a few operators would be a great start.


Eric Hanson added a comment - 08/Jul/13 21:43

Dmitriy and Vinod,

What specifically do you want to do with the code once it is factored out?


Dmitriy V. Ryaboy added a comment - 08/Jul/13 22:05

I would like to provide the same vectorization benefits to Pig and similar frameworks (possibly Cascading, and maybe the Spark or Crunch guys will want to use this as well, etc).

Jitendra Nath Pandey added a comment - 11/Jul/13 18:33

Dmitriy, Vinod,
There is a significant amount of vectorization work in expression evaluation, for example arithmetic expressions, logical expressions, and aggregations. Many of these expressions are pretty generic, and different systems are likely to have similar semantics for them. It should be possible to re-use this code with little change in Pig or other systems. Re-using these expressions requires using the same vectorized representation of data in the processing engine, but that part of the code is also generic and re-usable. I think that could be a good starting point.
However, a bunch of the vectorization work is in operator code, where we have vectorized versions of the Hive operators. These operators are closely tied to Hive semantics and implementation. Therefore, generalizing them for re-use in other projects will need some restructuring in the Hive code base as well. Also, at this point we should be thinking more generally about a common physical layer shared between Pig and Hive. These languages can continue to have different logical plans, but it would be desirable for them to share a common physical plan structure because they both use the same map-reduce runtime.

Dmitriy V. Ryaboy added a comment - 11/Jul/13 20:30

I believe physical plan primitives for both Hive and Pig (and potentially others) are going to come in via Tez, as both Pig and Hive want to get off strict MR in the long-term.

I'll take a crack at extracting what's extractable. Right now Hive's UDAF reaches fairly deeply into this code, as you noted, but I think with a little restructuring this can be factored out.

Eric Hanson added a comment - 17/Sep/13 23:16

Updated design specification with new section describing the vectorized UDF adaptor (HIVE-4961).
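The general shape of an adaptor of this kind can be sketched as a loop that calls an existing row-at-a-time function once per row of a batch and writes the results into an output vector. This is only an illustration of the idea: `UdfAdaptorSketch` and `evaluate` are hypothetical names, and `LongUnaryOperator` stands in for a row-mode UDF; none of this is the adaptor's actual interface.

```java
import java.util.Arrays;
import java.util.function.LongUnaryOperator;

final class UdfAdaptorSketch {

    // Applies a row-mode function to the first `size` entries of the input vector.
    // The per-row call overhead remains, but the function can now sit inside
    // a pipeline that passes batches between operators.
    static void evaluate(LongUnaryOperator rowUdf, long[] input, long[] output, int size) {
        for (int i = 0; i < size; i++) {
            output[i] = rowUdf.applyAsLong(input[i]);
        }
    }

    public static void main(String[] args) {
        long[] in = {1, 2, 3};
        long[] out = new long[in.length];
        evaluate(x -> x * x, in, out, in.length);
        System.out.println(Arrays.toString(out));
    }
}
```

An adaptor like this gives up the tight-loop benefit for the wrapped function in exchange for not having to rewrite every existing UDF before a query can run in vectorized mode.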

Jitendra Nath Pandey added a comment - 01/Oct/13 18:01

Vectorization work has been committed to trunk. Going forward, all the vectorization work will happen on trunk and the vectorization branch will be obsolete.

Lars Francke added a comment - 04/Oct/13 11:26

This is a huge patch and it's hard to see if it changes anything for the end user. As we'd like to keep the Wiki up-to-date it'd be great if someone could comment whether there are any configuration options besides hive.vectorized.execution.enabled or any other things that should be documented.


Eric Hanson added a comment - 04/Oct/13 16:19

I've been planning to write some user documentation for this feature. Where do you think would be a good spot in the wiki to include it?

Lefty Leverenz added a comment - 05/Oct/13 10:47

Put it in Design Docs (https://cwiki.apache.org/confluence/display/Hive/DesignDocs) until it's released. Later you can move it into the User Docs with a note about which release introduces it. You can either change the file's location in the hierarchy or leave it in place and just link to it from the User Docs section.

When it goes into User Docs, you have some choices. Does it belong on the Home page or in the Language Manual? If in the Language Manual, do you want it under DML or should it be a stand-alone doc? That depends on what you write and how you want readers to find the doc. You can always add links from other docs to make sure people find it.

Here's the Language Manual: https://cwiki.apache.org/confluence/display/Hive/LanguageManual.

Of course configuration goes here, perhaps in a subsection under Query Execution: https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties. I suggest you make a section in your design doc that's formatted to match the configuration doc, so when the time comes you can just cut & paste.

Eric Hanson added a comment - 01/Nov/13 17:45


