A BRIEF HISTORY OF BIG DATA


Timelines, Visualization

Introduction

Big data analytics has been a prominent topic in the era of the fourth industrial revolution. The term "big data" has been traced to the mid-1990s, when John Mashey used it to describe the handling and analysis of massive unstructured datasets. Later, in 2001, Doug Laney characterized big data by three dominant dimensions: volume, velocity, and variety. Nowadays, big data is not only about how much data we have. More importantly, analyzing it reveals regular patterns that help improve efficiency, optimize development, and, as a result, create more benefits. In this fast-changing world, big data still has a lot of room to grow. Significant advances in big data and analytics can be expected to arrive at a faster pace, and new technologies will vastly improve our ability to store, process, and analyze data.

Materials and Tools

In this project, Timeline JS is used to generate the timeline. Timeline JS is an open-source tool created by the Northwestern University Knight Lab for building interactive timelines that present events chronologically along a line. At the beginning, I assumed from the name that the tool was tied to JavaScript and that coding skills might be required. However, it turned out to be much easier than I thought. The timeline is generated automatically once the research content has been entered into a specific Google spreadsheet. Even those without a programming background can easily use this tool.

Google Spreadsheet

Process

The previous student works posted online show the importance of an organized timeline combined with effective visual assets. After researching the history of big data, I collected events ranging from 18,000 BCE to today and transferred them into the Google sheet template that Timeline JS provides. Instead of transferring every point in time into the sheet, I picked only 7 representative events.

Challenges

At the beginning, I thought that the timeline could not present ancient events. The “Year” column only accepts numbers, so my attempt to enter events dated Before the Common Era (BCE) failed. My first event, about ancient people using data to count trading activities, dates back to 18,000 BCE and could not be added to the timeline because the year contained “BCE”. I then figured out that a negative value in the “Year” column represents a BCE date. After I entered “-18,000” in the “Year” column and “18,000 BCE” in the “Display Date” column, the first event appeared in the timeline.
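The negative-year workaround described above can be sketched in a few lines of Python. This is only an illustration for preparing spreadsheet values by hand or from a script. The function name `bce_to_timeline_fields` is hypothetical, and the sketch assumes the "Year" cell should hold a plain integer without a thousands separator while "Display Date" keeps the human-readable label.

```python
import re


def bce_to_timeline_fields(label: str) -> dict:
    """Turn a label such as '18,000 BCE' or '1989' into spreadsheet values.

    Hypothetical helper: a BCE year becomes a negative number for the
    "Year" column, and the original label is kept for "Display Date".
    """
    match = re.fullmatch(r"\s*([\d,]+)\s*(BCE|CE)?\s*", label, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"Unrecognized year label: {label!r}")

    # Drop thousands separators so the value is purely numeric.
    magnitude = int(match.group(1).replace(",", ""))
    era = (match.group(2) or "CE").upper()

    # Assumption taken from the text above: negative values mean BCE.
    year = -magnitude if era == "BCE" else magnitude
    return {"Year": year, "Display Date": label.strip()}


if __name__ == "__main__":
    print(bce_to_timeline_fields("18,000 BCE"))
```

The two returned values correspond to the two cells filled in for the first event; labels without an era marker are treated as Common Era years.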

I picked one media image and one background image for each event in the timeline to present the information in an engaging and appealing way. Specifically, I chose dark images as backgrounds so that they would not interfere with the text placed over them. In the first iteration of the timeline, I put a media caption on the left and a credit on the right under each media image. However, I realized that having both a caption and a credit was confusing and redundant. So, in the final version, I deleted the credit and used only the caption to help explain the image.

What is more, after review and critique, I discovered that a row of the “title” type adds a cover page at the beginning of the timeline. This page requires no year information, and its title is set in a larger font size than on all the other pages. I added a title and a brief description of big data. With a cover, the big data timeline project is more complete and refined.
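The two rules encountered so far (event rows need a numeric year, while a "title" row needs none) can be checked before pasting rows into the sheet. The following Python sketch is an assumption-laden illustration, not part of Timeline JS: `check_row` is a hypothetical helper, and the "Type" key is assumed to be the name of the column that holds the "title" value.

```python
def check_row(row: dict) -> list:
    """Return a list of problems found in one spreadsheet row.

    Hypothetical check based only on the behavior described in this
    report: a "title" row may omit the year, while every other row
    needs a whole-number year (negative for BCE).
    """
    problems = []
    row_type = str(row.get("Type", "")).strip().lower()

    if row_type != "title":
        year_text = str(row.get("Year", "")).strip()
        try:
            int(year_text)
        except ValueError:
            problems.append("Year must be a whole number (negative for BCE)")
    return problems


if __name__ == "__main__":
    cover = {"Type": "title", "Headline": "A Brief History of Big Data"}
    bad_event = {"Year": "18,000 BCE", "Display Date": "18,000 BCE"}
    print(check_row(cover))
    print(check_row(bad_event))
```

A cover row passes without a year, while an event row whose year still contains "BCE" is flagged, which mirrors the issue met with the first event.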

Future Direction

Although Timeline JS is already relatively easy to use, there is still room to improve the user experience and to make the interface more concise and understandable. Instead of relying on a separate Google spreadsheet, it would be more convenient if the spreadsheet and the Timeline JS tool were combined into one. Detailed instructions for fixing common issues would also help.

Result

A Preview of Timeline

https://cdn.knightlab.com/libs/timeline3/latest/embed/index.html?source=1u2RXgz-LNg0Ouo5Q3YGAtPfwU_u6N5v2iK6q-Ml2Z6k&font=Default&lang=en&initial_zoom=2&height=650

Conclusion

Overall, this was an interesting project that contributed to a better understanding of both information visualization and the history of big data. As early as 18,000 BCE, ancient people recognized the importance of data when counting trading activities. Nowadays, big data analytics has become a trend in various fields, including business and science. The timeline overview shows that big data analytics has been developing at an increasing pace. The ability to store and analyze information evolved gradually, and since the last century it has accelerated as a result of the development of digital storage and the internet. In the future, more events could be added to the timeline for a more detailed historical overview. Compared with a text document, the timeline is a clearer medium for presenting this evolution. As a good tool for beginners, Timeline JS offers simple functions that help creators add interactive visual assets in an organized hierarchy and in chronological order.

Resources

https://journals.sagepub.com/doi/pdf/10.1177/2053951716631130#:~:text=In%202001%2C%20Doug%20Laney%20detailed,%E2%80%A2

https://doi.org/10.1016/j.jbusres.2022.113525

https://home.cern/science/computing/birth-web/short-history-web

https://link.springer.com/chapter/10.1007/978-3-030-16272-6_6

https://studentwork.prattsi.org/infovis/archive/labs/timelines/