What is HDFS

HDFS (Hadoop Distributed File System) is a file system, like the one on an ordinary desktop or laptop, used to store data. It is designed specifically for storing huge datasets on a cluster of commodity hardware with a streaming access pattern. The data may be text files, images, audio, video, and so on.

Streaming access pattern

A streaming access pattern means that a file is written once and read many times, and its content is not changed after it is written.

Operations in HDFS

HDFS supports two operations: the write operation and the read operation.

Write Operation

Assume that you are writing a file into HDFS. Your write request goes to the Distributed File System (DFS), which makes an RPC call to the NameNode (NN) to create the new file. Before creating the file, the NameNode does a couple of things: it checks that the file does not already exist and that the user has permission to create a new file. Once all the checks pass, the NameNode will provide a...
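The two checks the NameNode performs before creating a file can be sketched as a small toy model. This is only an illustration of the idea, not the Hadoop API: the class `ToyNameNode`, its `create` method, and the `writable_dirs` mapping are hypothetical names invented for this example, and a real NameNode's permission model is more detailed than a per-directory set of users.

```python
class ToyNameNode:
    """Toy stand-in for the NameNode's pre-create checks (not the real Hadoop API)."""

    def __init__(self, writable_dirs):
        # Hypothetical permission table: directory path -> users allowed to create files there.
        self.writable_dirs = writable_dirs
        # Paths of files that already exist in this toy namespace.
        self.files = set()

    def create(self, path, user):
        # Parent directory of the requested file; a file directly under root has parent "/".
        parent = path.rsplit("/", 1)[0] or "/"

        # Check 1: the file must not already exist.
        if path in self.files:
            raise FileExistsError(path)

        # Check 2: the user must be allowed to create files in the parent directory.
        if user not in self.writable_dirs.get(parent, set()):
            raise PermissionError(f"{user} cannot create files in {parent}")

        # Both checks passed, so record the new file in the namespace.
        self.files.add(path)
        return path
```

For example, with `nn = ToyNameNode({"/data": {"alice"}})`, the call `nn.create("/data/a.txt", "alice")` succeeds the first time, a second identical call raises `FileExistsError`, and `nn.create("/data/b.txt", "bob")` raises `PermissionError`.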