Posts

Showing posts from August, 2020

What is Apache Hive

What is Apache Hive? Hadoop is like a sea of tools and technologies that help us get our work done, and Hive is one of them. Hive runs on top of Hadoop. Apache Hive is a Hadoop component developed mainly for data analysts. Although Apache Pig can be used for the same purpose, Hive is used more by researchers and programmers. It is an open-source data warehousing system used to query and analyze large volumes of data stored in Hadoop HDFS. Hive supports data querying, data summarization, and data analysis. HiveQL is the query language of Hive; Hive translates these SQL-like queries into MapReduce jobs and runs them on Hadoop. Hive also provides a shell in which we can run the basic operations that Hive supports. When we run a HiveQL query in the Hive shell, Hive typically launches a MapReduce job internally and returns the result. Hive offers schema flexibility as well as data serialisation and deserialisation. Advantage of Apa