Apache Hive is a data warehouse system built on top of Hadoop to query Big Data. Hive originated at Facebook and was open sourced in August 2008. The challenge Facebook had to address is one faced by many companies since then. Eventually data growth in a company challenges the capabilities of deployed RDBMS or NoSQL systems. Reports and analytics start to take minutes, then hours, and eventually overlap with other queries and the whole system grinds to a halt. Another common scenario company’s start processing big data with Hadoop discovers the value of making the data accessible beyond the development team capable of writing complex map-reduce jobs.
- Basic familiarity with SQL and/or a scripting language
- No pre-existing knowledge of Hadoop is required