Tuesday, 19 July 2016



Big Data In a Nutshell

 Introduction 
 
At present, the importance of big data is being realised only slowly. Big data is complex data that requires advanced methods to extract value. Here the size of the data goes beyond the ability of conventional software tools to capture, process and analyse it.
As the data fluctuates every moment, accuracy plays a very important role: better accuracy leads to better decisions, and as a result growth takes place. I have summarised the definition of big data as given in various research papers.



        

In 2010, the Apache Hadoop project defined big data as "data sets which could not be captured, managed and processed by general computers within an acceptable scope". In other words, this is complex data that cannot be maintained with the help of ordinary software; it requires high-level techniques and technologies.
The Gartner group defines big data in terms of three-dimensional data growth challenges and opportunities, known as the 3 V's:
(1)   Volume – This refers to the sheer and increasing amount of data. Big data does not sample; it observes and tracks everything that happens, so the data simply keeps growing.

(2)   Velocity – This refers to the speed at which data is generated and must be handled. Nowadays the data is growing very rapidly: the world's per capita capacity to store data has been doubling roughly every 40 months, which implies that new storage technology is needed about every 40 months, and around 2.5 exabytes of data are created every day.

(3)   Variety – Variety refers to the range of types and sources of big data. It is very important to understand the variety, because it tells us where the data will be drawn from: software logs, cameras, microphones, radio-frequency identification (RFID) devices, mobile devices and so on.
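The "doubling every 40 months" figure above can be turned into simple compound-growth arithmetic. The sketch below is only an illustration: the 40-month doubling period comes from the text, while the starting capacity is a made-up example value.

```python
# Sketch: compound growth implied by "storage capacity doubles every 40 months".
# The starting capacity of 1.0 is an arbitrary example unit, not a real figure.

def capacity_after(months, start_capacity=1.0, doubling_period=40):
    """Capacity after `months`, assuming it doubles every `doubling_period` months."""
    return start_capacity * 2 ** (months / doubling_period)

# After 10 years (120 months) capacity has doubled three times: 2**3 = 8x.
print(capacity_after(120))  # -> 8.0
```

This kind of back-of-the-envelope projection is one way to see why storage technology has to be replaced so often.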
Many important areas of technology are related to big data. Fields such as cloud computing play a very vital role together with big data in the advancement of technology. Big data also has certain challenges which need to be overcome.
Big data is one field which relates directly to decision making: simply with the help of charts and graphs it can lead to excellent decisions.

 
 History 
 
This started about seven decades ago. Earlier it was referred to as the "information explosion", a term first recorded in the Oxford English Dictionary. Our increasing ability to store and analyse data has been a gradual evolution. The main evolution of big data started at the end of the last century, and it took place because of the invention of digital storage and the internet.
As usage of the term big data increased, it spread through the literature: various books and articles were written for a better understanding of the term. In 2008, it was estimated that 14.7 exabytes of information were produced, and according to the reports the figure kept increasing.
In 2014 came the rise of mobile machines, and people started to use mobile devices to access digital data. Now big data analytics is becoming a top priority for business. Big data is not a new phenomenon but one that has had a long evolution of capturing and using data. Big data is also laying the foundations on which many further evolutions will be built.


Four Layers 

There are four layers in big data. They are as follows:-
(1)   Data source layer:-
This is the first layer of big data, where the data arrives. For this we first need to analyse whatever data we already have, and the next step is to find out what questions we need to answer. Analysing the questions is very important, because it helps to establish new sources of data.



(2)   Data storage layer:-
After data is collected from the first layer, the next step begins. As enterprises generate ever larger volumes of data, storage starts to explode. For smaller data sets, all that is required is a bigger hard disk.
When you move on to huge data, however, a distributed file system becomes necessary: you must have a system that understands the file system and can handle the database that is being generated.


Depending on the amount of data you are storing, you also need to decide what your security and privacy requirements are.

(3)   Data processing layer:-
As the name suggests, the analysis of data takes place in this layer. This is the most crucial layer of big data, since it enables us to reach a particular solution.

(4)   Data output layer:-
The results are presented here, for example as charts and graphs prepared from the analysed data. Presenting the data as simply as possible is the key feature that allows quick and correct decisions to be taken.
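The four layers described above can be sketched as a tiny pipeline. The sketch below is purely illustrative: the function names and the in-memory "store" are invented stand-ins, not a real big-data framework.

```python
# Minimal sketch of the four big-data layers as a pipeline.
# All names and data are illustrative, not a real framework API.

def source_layer():
    # Data source layer: raw records arrive from logs, sensors, devices.
    return [{"device": "cam-1", "value": 3}, {"device": "rfid-7", "value": 5}]

def storage_layer(records, store):
    # Data storage layer: persist records (an in-memory list stands in
    # for a distributed file system here).
    store.extend(records)
    return store

def processing_layer(store):
    # Data processing layer: analyse the stored data, e.g. compute a total.
    return sum(r["value"] for r in store)

def output_layer(result):
    # Data output layer: present the result simply for decision-making.
    return f"total sensor value: {result}"

store = []
storage_layer(source_layer(), store)
print(output_layer(processing_layer(store)))  # -> total sensor value: 8
```

In a real deployment each function would be a whole subsystem (ingestion, distributed storage, batch or stream processing, and dashboards), but the flow of data through the layers is the same.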


Technology

Aim:-  Real-time delivery of information.
For handling data, various technologies were used, such as relational database management systems (RDBMS) and desktop statistics and visualisation packages. These fail, however, when big data comes into play. Users of big data prefer direct-attached storage (DAS), which comes in many forms, such as solid-state drives (SSD).




With the help of this, the capacity of the SATA disks buried inside parallel processing nodes increases. While using any technology, it must be ensured that latency is kept in mind and avoided wherever possible. Advancement of technology in big data is very crucial: proper advancement can lead to exact conclusions, and by producing exact conclusions one can easily predict the trends the market is following, which is very helpful in forecasting the market.
Various technologies such as the following are used in handling big data:-
(1)   A/B Testing
(2)   Machine learning
(3)   Natural Language Processing(NLP)
(4)   Cloud Computing
(5)   Business Intelligence
(6)   Charts
(7)   Graphs
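To make the first item concrete, here is a minimal sketch of A/B testing: comparing the conversion rates of two page variants with a two-proportion z-test. The visitor and conversion counts are made-up example numbers, not real data.

```python
import math

# Sketch of A/B testing with a two-proportion z-test.
# All counts below are invented for illustration.

def ab_z_score(conv_a, n_a, conv_b, n_b):
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)          # pooled conversion rate
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# Variant A: 200 conversions out of 2000 visitors; variant B: 250 out of 2000.
z = ab_z_score(conv_a=200, n_a=2000, conv_b=250, n_b=2000)

# Two-sided p-value from the standard normal distribution.
p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
print(round(z, 2), p_value < 0.05)  # -> 2.5 True
```

A small p-value here means the difference between the two variants is unlikely to be chance, which is exactly the kind of evidence that feeds into the decision-making role of big data described above.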


Applications

Cloud Computing :-


Cloud computing is the delivery of computing services over the internet. It allows users to access software and hardware that are hosted by third parties at remote locations.

It is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources that can be rapidly provisioned and released with minimal management effort or service-provider interaction.










The cloud model consists of five essential characteristics:-

(1)   On demand self service:-
A consumer can unilaterally provision computing capabilities, such as server time and network storage, as needed automatically, without requiring human interaction with each service provider.
(2)   Broad network access:-
Capabilities are available over the network and accessed through standard mechanisms that promote use by heterogeneous thin or thick client platforms (such as mobile phones, tablets and laptops).
(3)   Resource pooling:-
The provider's computing resources are pooled to serve multiple consumers using a multi-tenant model, with different physical and virtual resources dynamically assigned and reassigned according to consumer demand.
(4)   Rapid Elasticity:-
Capabilities can be elastically provisioned and released, in some cases automatically, to scale rapidly outward and inward commensurate with demand.
(5)   Measured services:-
The system automatically controls and optimises resource use by leveraging a metering capability at some level of abstraction appropriate to the type of service.
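The measured-service idea can be sketched with a tiny metering class that tracks usage per consumer and derives a pay-per-use bill. The class, tenant names, units and rate below are all invented for illustration, not a real cloud provider's API.

```python
# Sketch of "measured service": a tiny meter that records resource usage
# per consumer and bills pay-per-use. All names and rates are illustrative.

class Meter:
    def __init__(self):
        self.usage = {}            # consumer -> gigabyte-hours used

    def record(self, consumer, gb_hours):
        self.usage[consumer] = self.usage.get(consumer, 0.0) + gb_hours

    def bill(self, consumer, rate_per_gb_hour=0.05):
        # The metered figure drives the bill transparently for both
        # provider and consumer.
        return self.usage.get(consumer, 0.0) * rate_per_gb_hour

meter = Meter()
meter.record("tenant-a", 100.0)
meter.record("tenant-a", 20.0)
print(meter.bill("tenant-a"))  # -> 6.0
```

Real providers meter at whatever abstraction fits the service (storage, bandwidth, active user accounts), but the principle of monitored, reported, pay-per-use consumption is the same.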
 

Relationship between cloud computing and Big data

The development of cloud computing can lead directly to solutions for the challenges big data is facing, so it is very crucial to enhance its development. With the help of cloud computing, the storage issue, one of the biggest problems big data faces, can be solved. Another key factor affecting big data is distributed storage, which can manage big data effectively.

Cloud computing mainly affects the architecture of the IT industry, whereas big data plays a vital role in decision making. The two are indirectly connected; therefore, the development of both could lead to further advancement and enhancement in the field of technology.



Relationship between IoT and Big Data

In the IoT, huge numbers of sensors are embedded into various devices and machines in the real world. The sensors fixed in these devices and machines produce a huge amount of data, drawn from many fields: environmental data, transport data and many others.

This huge amount of generated data can itself be referred to as big data. It has its own characteristics: it can be structured, unstructured or something in between, and it needs to be analysed. For analysis, graphs and charts are prepared to reach a certain solution or conclusion, and that conclusion can be helpful in solving a problem.

Currently, the data processing capacity of the IoT lags behind the data it generates. It is very important to introduce new technology in this field, which could lead to development and hence produce good conclusions that would ultimately give a good solution to a certain problem.
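The analysis step described above can be sketched very simply: readings stream in from many sensors and are aggregated into a summary that charts and conclusions can be built on. The sensor names and readings below are invented examples.

```python
from collections import defaultdict

# Sketch: aggregating readings from many IoT sensors into per-sensor
# averages. Sensor names and values are made up for illustration.

readings = [
    ("temp-sensor-1", 21.5), ("temp-sensor-1", 22.0),
    ("traffic-cam-3", 130.0), ("traffic-cam-3", 150.0),
]

def average_by_sensor(readings):
    sums, counts = defaultdict(float), defaultdict(int)
    for sensor, value in readings:
        sums[sensor] += value
        counts[sensor] += 1
    return {s: sums[s] / counts[s] for s in sums}

print(average_by_sensor(readings))
# -> {'temp-sensor-1': 21.75, 'traffic-cam-3': 140.0}
```

At real IoT scale this aggregation would run on a distributed processing system rather than a single dictionary, but the shape of the computation, from raw readings to a compact summary, is the same.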
 
 
