
Showing posts with the label HDFS Command

What is Apache Hive

What is Apache Hive? Hadoop is like a sea, with many tools and technologies that help us get the job done. Hive is one of those technologies, and it runs on top of Hadoop. Apache Hive is a Hadoop component developed mainly for data analysts. Although Apache Pig can be used for the same purpose, Hive is used more by researchers and programmers. It is an open-source data warehousing system used to query and analyze large volumes of data stored in Hadoop HDFS. Hive supports data querying, data summarization and data analysis. HiveQL is the query language of Hive; Hive translates these SQL-like queries into MapReduce jobs that run on Hadoop. Hive also provides a shell in which we can perform the basic operations it supports. When we run a HiveQL query in the Hive shell, Hive launches a MapReduce job internally and returns the result. Hive offers schema flexibility as well as data serialisation and deserialisation. Advantage of...

HDFS Commands Part - II

In Part I we learned the basic HDFS commands; in this part we will look at intermediate-level commands. Before reading this article, I suggest you learn the basic HDFS commands.

Commands

1. copyFromLocal
This HDFS command is similar to the put command, but the source is restricted to a local file reference.
     Usage: hdfs dfs -copyFromLocal <local_path> <hdfs_path>
     Example: hdfs dfs -copyFromLocal /home/user/Desktop/file.orc /dir_1/

2. copyToLocal
This HDFS command copies a file or directory from HDFS to the local file system.
     Usage: hdfs dfs -copyToLocal <hdfs_path> <local_path>
     Example: hdfs dfs -copyToLocal /dir_1/file.orc /home/user/Desktop/

3. text
This HDFS command takes a source file and displays its content in text format.
     Usage: hdfs dfs -text <hdfs_file_path>   ...
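
The three commands in this excerpt can be tried together from a small script. The sketch below is only an illustration: the `hdfs_cmd` wrapper and the `DRY_RUN` variable are names invented here, and `/dir_1/notes.txt` is a hypothetical file. By default the script only prints each command, so it can be read and run on a machine without a Hadoop installation; setting `DRY_RUN=0` would execute the commands against a real cluster.

```shell
#!/bin/sh
# DRY_RUN and hdfs_cmd are helper names made up for this sketch.
# With DRY_RUN=1 (the default) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

hdfs_cmd() {
    if [ "$DRY_RUN" = "1" ]; then
        # Show the command that would be run.
        printf '%s\n' "hdfs dfs $*"
    else
        # Run the real command (requires a configured Hadoop client).
        hdfs dfs "$@"
    fi
}

# 1. copyFromLocal: local file -> HDFS (paths taken from the post's example)
hdfs_cmd -copyFromLocal /home/user/Desktop/file.orc /dir_1/

# 2. copyToLocal: HDFS -> local file system (paths taken from the post's example)
hdfs_cmd -copyToLocal /dir_1/file.orc /home/user/Desktop/

# 3. text: display a file's content as text (notes.txt is a hypothetical file)
hdfs_cmd -text /dir_1/notes.txt
```

Keeping the dry-run mode as the default makes it easy to check the argument order (source first, destination second) before touching any data.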