Big Data


What is Data?

Data is information in raw, unprocessed form. It can be characters, text, numbers, images, audio, or video.

What is Big Data?

Big data is a term for data sets so large and complex that traditional tools cannot handle them. When data is structured, unstructured, or semi-structured and also very large, relational database engines cannot process it in a reasonable time. Moreover, the amount of data keeps growing exponentially over time. This type of data is called "big data".

Categories Of Big Data

  1. Structured

    Structured data is any data organized into fixed fields and records, for example RDBMS tables and CSV files.

    [Figure: Example of structured data]
  2. Unstructured

    Unstructured data is any data that does not have a predefined format, for example machine-generated logs and web pages.

    [Figure: Example of unstructured data]

  3. Semi-Structured

    Semi-structured data is any data that does not follow the fixed schema of a conventional database system but carries tags or keys that describe its own structure, for example JSON and XML.

    [Figure: Example of semi-structured data]
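
The three categories above can be illustrated with a small Python sketch. The sample values (the names, the log line, the disk-usage pattern) are made up for illustration and are not taken from any real system.

```python
import csv
import io
import json
import re

# Structured: fixed fields and records, as in a CSV file.
csv_text = "id,name,age\n1,Asha,30\n2,Ravi,25\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Semi-structured: no fixed schema, but keys describe the structure.
json_text = '{"id": 3, "name": "Mina", "tags": ["new", "vip"]}'
record = json.loads(json_text)

# Unstructured: free text; any structure has to be extracted by the reader.
log_line = "2024-01-15 10:32:07 ERROR disk /dev/sda1 is 95% full"
match = re.search(r"(\d+)% full", log_line)
disk_usage = int(match.group(1)) if match else None

print(rows[0]["name"], record["tags"], disk_usage)
```

Notice that the CSV rows can be read by column name right away, the JSON record describes itself through its keys, and the log line needs a hand-written pattern before any value can be pulled out of it.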
     

  The 4 V's of Big Data

  1. Volume 

    Earlier, most data was created by humans. Today, data is generated by machines, networks, and social media. Storage technologies are now capable of holding this much larger amount of data.

  2. Velocity

    Data arrives at extremely high speed, often in real time. Velocity also describes how fast data is processed and analyzed.
  3. Variety

    Data comes in different formats, such as numeric data, email, video, audio, stock ticker data, and financial transactions.
  4. Veracity

    Veracity is the quality of the data; not all data is good. That's why the system should be able to extract the good data instead of saving both good and bad data.
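
As a rough sketch of the veracity idea, the Python snippet below keeps only records that pass a few basic quality checks. The field names (`name`, `age`), the sample records, and the validity rules are hypothetical and chosen only for illustration; a real system would define its own checks.

```python
def is_valid(record):
    """Return True if the record passes basic quality checks (assumed rules)."""
    name = record.get("name")
    age = record.get("age")
    return (
        isinstance(name, str) and name.strip() != ""
        and isinstance(age, int) and 0 <= age <= 120
    )

records = [
    {"name": "Asha", "age": 30},
    {"name": "", "age": 25},       # empty name
    {"name": "Ravi", "age": -4},   # impossible age
    {"name": "Mina"},              # age field absent
]

# Keep the good records and drop the bad ones instead of storing both.
good = [r for r in records if is_valid(r)]
print(good)
```

Filtering at ingestion time like this is only one option; some systems store everything and mark bad records instead.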

 Why Big Data?

  1. Monitoring - To watch daily activities and prevent vulnerabilities.
  2. Alert - To send an alert if a vulnerability or data theft occurs.
  3. Fraud Detection - To detect fraud and other crime within the organization.

  Who Uses Big Data?

  1. Education - To track the progress of students and staff.
  2. Health Care - To analyze hospital activity and protect patient identity.
  3. Banking - To detect fraud and security issues.
  4. Retail and Manufacturing - To analyze sales growth, etc.


                                                                                    