The Next Big Things: Big Data And Big Revolution

Dr. Jayarama Reddy, Professor, St. Joseph's College (Autonomous)
Around the world and across all fast-paced platforms, 'Big Data' is the buzzword of the moment, and it is a potential money-spinner too. People are making money from data, and data itself is now being traded. The internet is the biggest means of creating big data. There are more than 5 billion internet users in the world now. Not only is the number of internet users increasing, but the diversity of usage is also rising rapidly. Social media, aided by real-time apps (RTAs), is truly mammoth: it adds about 550 new users each minute, or roughly 0.3 billion new users every year. The world is adding more than 1.2 million new data-producing social media users each day. YouTube, Snapchat, Instagram, TikTok, Facebook, LinkedIn, Twitter, ResearchGate and other such real-time apps generate colossal volumes of user data that are unstructured and not easy to store. The accumulation of data is also growing rapidly in bioinformatics, cheminformatics, geoinformatics and health care systems. Biological mega-databases like NCBI, EMBL, PDB, ExPASy, DDBJ, DrugBank and others are also big accumulators of data. Cloud computing is likewise generating data at an unprecedented rate from heterogeneous sources like social media, business, marketing, government, scientific research, medicine and health. This calls for a diligent and insightful use of big data analytics to develop tools and technologies. Big Data technologies span several system layers: data storage, data processing, data querying, data access and management. We also need to be ready for potential challenges and future scope. Google created the Google File System (GFS) architecture and the MapReduce programming model. Together, these made it possible to scale data storage across distributed machines and to process very large datasets in parallel. This in turn led to an inundation of data from diverse domains, creating a great collection of structured data.
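
The MapReduce idea mentioned above can be illustrated with a minimal, single-machine sketch in Python. It is not Google's implementation: the function names (map_phase, shuffle, reduce_phase, word_count) and the sample posts are assumptions made purely for illustration, and a real framework would spread each phase across many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: combine the values of each key into one total."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents):
    # In a real cluster each document (or block of a file) would be mapped
    # on a different machine; here the map calls simply run one after another.
    pairs = []
    for document in documents:
        pairs.extend(map_phase(document))
    return reduce_phase(shuffle(pairs))


if __name__ == "__main__":
    posts = ["big data big revolution", "data is the new oil"]  # made-up sample
    print(word_count(posts))
```

Because each map call looks at only one document and each reduce call at only one key, both steps can in principle run in parallel, which is what makes the approach suitable for very large datasets.
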
At present, the RTAs of social media, sensor networks, digital transactions, stock management, blog posts and internet traffic produce incredible datasets. Data mining approaches are essential to fish out economically beneficial patterns and to extract the valuable information concealed in such vast datasets. The noted computer scientist Jim Gray named this period of data inundation 'the fourth paradigm'.

The world is presently living in a 'data deluge era', as is evident from the continuous accumulation of large volumes of data from various sources. For example, recent figures tell us that Google was handling 3.5 billion requests per day and generating 2.5 quintillion bytes of data in 2015. By 2019 this had gone up by almost 100 percent. A study by Egan Marino Corporation (EMC) projected that the worldwide quantity of data would reach 40,000 exabytes by 2020, a 100 percent increase every 24 months. This gigantic data explosion is widely known as 'Big Data', a term that emphasizes its rapid and widespread impact on technology and society as a whole.
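
A 100 percent increase every 24 months is a doubling law, and a small Python sketch can show what it implies. The function name projected_volume is an assumption for illustration. The only figure taken from the text is the 40,000 exabytes quoted for 2020; the other values are simple extrapolations of the doubling rule, not forecasts from any study.

```python
def projected_volume(base_volume, months_elapsed, doubling_months=24):
    """Return the volume after `months_elapsed` months under steady doubling.

    Negative values of `months_elapsed` look backwards in time.
    """
    return base_volume * 2 ** (months_elapsed / doubling_months)


if __name__ == "__main__":
    base_exabytes = 40_000  # figure quoted in the text for 2020
    for year in (2016, 2018, 2020, 2022):
        months = (year - 2020) * 12
        print(year, round(projected_volume(base_exabytes, months)), "exabytes")
```

Under this rule the volume quadruples in four years, which is why storage and processing tools that were adequate a few years earlier quickly fall behind.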

YouTube, the biggest public platform for hosting videos, has grown by more than 300 percent in about four years since 2014. Casual as well as professional uploaders have been depositing 400 hours of new video each minute of every day. Netizens watch about 4.5 million videos every minute on YouTube. Instagram, a similar platform, is also catching up, with about 100 million photos and videos being added every day. The number of posts Instagram receives each day is more than 70 million. Another famed social media portal, Facebook, accounted for three million posts per minute in 2016. It records nearly six million likes per day. The like button on Facebook has been pressed 13 trillion times so far. In 2019 there are over 2 billion monthly active Facebook users, compared to 1.44 billion in 2015. Users around the world post 4.3 million messages every day. Twitter has become a craze across all sections of society. It has grown by 58 percent since 2013. In 2019, people posted about 682 million tweets per day, which works out to roughly 474,000 tweets per minute. According to the Radicati Report (2019-23), about 293 billion emails were sent daily in 2019, a figure expected to reach 347 billion in 2023. In 2019 there are 3.9 billion email accounts, which will become 4.4 billion in 2023. The number of SMS messages generated every minute in the world is said to be about 0.1 billion. In the United States alone, 26 billion text messages were sent each day in 2017, or 94 messages per American per day. The number of mobile phone users is also growing at a rapid pace, resulting in the build-up of an enormous amount of data. In 2014, mobile phone users uploaded and downloaded around 2 exabytes of data. Within about two years, the data created by these devices increased fourfold to 8 exabytes. In 2019 there are more than 5 billion mobile internet users in the world.
Because mobile technologies on the Android and iOS platforms are so easily accessible, more than 67 percent of the world's population now uses the internet. Mobile phones, televisions, cars, airplanes and other Internet of Things devices are a massive source of data. Together they create 2.5 quintillion bytes of data per day. For example, a Pratt & Whitney turbofan engine creates 10 gigabytes of data per second.
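
The figures above are quoted per second, per minute and per day, which makes them hard to compare. The following Python sketch converts a rate to a per-day total. The helper name per_day is hypothetical, and the two inputs are figures quoted in the text (400 hours of video per minute and 10 gigabytes per second); the engine is assumed, for illustration only, to run continuously for a whole day.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Length of each supported time unit, in seconds.
UNIT_SECONDS = {"second": 1, "minute": 60, "hour": 3600, "day": SECONDS_PER_DAY}


def per_day(amount, unit):
    """Convert `amount` per `unit` into the equivalent amount per day."""
    return amount * (SECONDS_PER_DAY // UNIT_SECONDS[unit])


if __name__ == "__main__":
    print(per_day(400, "minute"), "hours of video uploaded per day")
    print(per_day(10, "second"), "gigabytes per day from one engine")
```

On these inputs, 400 hours per minute comes to 576,000 hours of video per day, and 10 gigabytes per second comes to 864,000 gigabytes per day.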

According to the International Data Corporation (IDC), Big Data refers to a new generation of architectures and technologies designed to extract economic value from very large volumes of diverse data at high velocity. IDC reports that 80 percent of the data produced worldwide is unstructured, hence the role of Big Data analytics has become indispensable. It is like finding the needle of insight in a haystack of massive unstructured data. The volume of the datasets is what differentiates big data from conventional data. Big Data datasets are beyond the capability of the traditional database software tools used to store, manage and analyze data. Big Data comes in three forms: structured, semi-structured and unstructured. Traditional data, on the other hand, is usually structured, which makes it easy to manage, store and analyze.

These datasets have to be analyzed in order to capitalize on their content and value. Big Data analytics is known for its veracity, variety and velocity, which let it keep pace with the rapid data production of the so-called RTAs. Traditional datasets, by contrast, are generated in batch mode and hence do not demand speedy analysis. For small datasets, a relational database management system (RDBMS) suffices as the storage medium. Specialized architectures such as NoSQL databases and the Hadoop Distributed File System (HDFS) are indispensable for storing big datasets. Data integration is much easier to achieve for small datasets than for big data. Big datasets need to be analysed efficiently by means of data mining algorithms to extract knowledge and produce profitable outcomes. Micro Focus ControlPoint has been integrated into devices in order to manage, control and protect unstructured data. It helps safeguard valuable technological, political, personal and business data.

Because of this data explosion, a new generation of computing tools is constantly required to store, process and visualize data. At present rates of growth, the data accumulated in organizations will soon reach exabyte magnitude. This has led organizations such as Google, Microsoft, Facebook, Amazon, YouTube and Infosys to develop their own Big Data platforms. The momentum of big data has also prompted India to launch a 'Big Data Initiative' to support Big Data analytics.

According to Cisco's predictions, internet traffic will exceed 2.3 zettabytes by 2020. Big Data analytics has set a sustainable pace of progress in bioinformatics and other information-based technologies. There is a need to integrate data management technology and networking technology to provide advanced data management and analysis techniques. Data analytics basically includes granular computing, algorithms, data mining, data visualization, pattern recognition, statistical analysis, deep learning and machine learning. This technology will be useful in all sorts of disciplines across biology, politics, business and management. Hence there is a strong need to design programmes that introduce these subjects. We need to develop and establish technologies as and when they are required. This untiring endeavour is a never-ending journey for technologists. When we design a new technology, tool or application, we cannot predict how long it will last. It will be in vogue, use and demand only until a newer and better one arrives. Educational institutions, companies and industries are becoming more and more aware that data analytics is increasingly a vital factor in being economically viable and in realizing personal and societal visions. Novel applications will be of great use in the current scenario of uncertainties. Predictions of global warming, climate change, weather, earthquakes, disease, crop yield and financial distress can be made more readily using these tools.

Thus Big Data analytics, data science and bioinformatics are subjects that are in great demand now and will remain so in the near future. Educational institutions and universities have been slow to start graduate and postgraduate programmes in these subjects, and general awareness among students and educationists is far lower than expected and required.

Dr. Jayarama Reddy, Professor

Dr. Jayarama Reddy completed his postgraduate degree in Botany at Mysore University and joined St. Joseph's College in 1990. Since then he has been working at the same college as an Associate Professor. Dr. Reddy conducted research at IIHR and IISc Bengaluru for his Ph.D. He received his Ph.D. (Biotechnology of Orchids) in 2002 from Bangalore University. In 2015, the National Environmental Science Academy honoured him with the 'Eminent Scientist Award-2015'. St. Joseph's College, Bangalore, presented him with the 'Scroll of Honour' for his scientific achievements in 2016.
