Introduction
Turkey’s richly diverse culture inherits elements from the Eastern Mediterranean and Central Asia, as well as from Eastern European and Caucasian traditions. This can be attributed to its unique geographical position, lying partly in Asia and partly in Europe. The same position also makes it inherently vulnerable to earthquakes: the country sits between two huge tectonic plates, Eurasia and Africa/Arabia, which are inexorably grinding into one another. Consequently, it suffers frequent quakes.
I found a dataset on the earthquakes that occurred in Turkey between 1910 and 2017, which I used for this project.
Tools and Methods
1. Kaggle
I collected my dataset from Kaggle, an online public data platform.
https://www.kaggle.com/caganseval/earthquake/downloads/earthquake.zip/4
The earthquakes in Turkey have been recorded by several organisations, and a university-based earthquake research centre gathered the data. They prepared this dataset with records of multiple attributes: id, date, time, latitude, longitude, country, city, area, direction, distance, depth, xm, md, richter, mw, ms, mb.
2. OpenRefine
The dataset comprised 24,007 records. It registered the magnitude of the earthquakes in six columns: xm, md, richter, mw, ms, mb. Each of the six represents a different scale for quantifying earthquake intensity. I excluded all but richter, which is considered the most commonly used.
I used OpenRefine, an open-source desktop application for data wrangling, to clean the dataset.
A few other things that I revised include the dates, which I clustered into years (more precisely, decades), and the blank records, which I removed from the dataset.
Curiously, the time stamps consistently showed an hour around midnight. I omitted or clustered all columns with unchanging values. This brought the record count down to 10,062.
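The cleaning steps above were done interactively in OpenRefine, but the same logic can be sketched in pandas. This is only an illustration: the sample rows are made up, and only a few of the dataset's columns are shown.

```python
import pandas as pd

# Made-up sample rows mimicking the Kaggle dataset's columns (illustrative only).
df = pd.DataFrame({
    "date": ["2003.05.20", "1995.10.01", "2011.08.11"],
    "richter": [4.9, 0.0, 5.2],
    "xm": [5.0, 0.1, 5.3],   # alternative magnitude scale
    "md": [4.8, 0.0, 5.1],   # alternative magnitude scale
})

# Keep only the Richter magnitude, dropping the other scales.
df = df.drop(columns=["xm", "md"])

# Remove blank records (here, rows with no recorded magnitude).
df = df[df["richter"] > 0]

# Cluster dates into decades, e.g. 2003 -> "2000-09".
year = df["date"].str.slice(0, 4).astype(int)
start = year // 10 * 10
df["decade"] = start.astype(str) + "-" + ((start + 9) % 100).astype(str).str.zfill(2)
```

In OpenRefine these steps correspond to removing columns, faceting out blank rows, and transforming the date column with an expression.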
3. Numbers
I used this spreadsheet application to clean the dataset further. I grouped all records falling in the same decade into one row. Since values like direction, distance, depth, and intensity cannot simply be merged, I represented each with its average. This significantly compressed the data.
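The grouping-and-averaging step above can be sketched with a pandas groupby; the rows here are invented purely to show the mechanics.

```python
import pandas as pd

# Invented sample records already tagged with their decade.
records = pd.DataFrame({
    "decade": ["1990-99", "1990-99", "2000-09"],
    "depth": [10.0, 20.0, 30.0],
    "distance": [1.2, 1.8, 2.0],
    "richter": [4.0, 5.0, 4.5],
})

# Collapse all records in the same decade into one row of averages.
by_decade = records.groupby("decade", as_index=False).mean()
```

In a spreadsheet the same result comes from sorting by decade and applying AVERAGE over each group's rows.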
4. Tableau
I used this interactive data visualisation software to make graphical representations of the data. The software allows users to design charts and graphs based on datasets. Tableau offers functions for mapping, plotting latitude and longitude coordinates, choosing colours and chart types, and much more. Preparing these graphs revealed certain facts and patterns in the data.
Findings
According to the time stamps, implausibly, all the earthquakes occurred around 12:00 AM. Possibly the research centre chose to record only those that occurred in that hour.
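An anomaly like this is easy to expose by counting records per hour of the day. A minimal sketch with made-up time stamps (the real dataset's values clustered near midnight):

```python
from collections import Counter

# Made-up sample time stamps in HH:MM:SS form (illustrative only).
times = ["00:01:12", "00:45:03", "23:59:58", "00:10:00"]

# Tally how many records fall in each hour; a healthy dataset would
# spread across all 24 hours rather than piling up at hour 0.
hour_counts = Counter(int(t[:2]) for t in times)
```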
https://public.tableau.com/profile/agreya#!/vizhome/EarthquakesinTurkey1910-2017/Dashboard1?publish=yes
I represented the count and intensity of the quakes in the same graph to make their contrasting trends apparent. The frequency of the tremors has clearly increased over the years, while their magnitude has dropped considerably, except for the years 2000–2017, when it appears to rise again. My choice of colours depicts the same: scarlet for the rising values and blue for the dropping ones.
This chart shows that the patterns of intensity, depth, and distance are very similar. When the recorded distance was largest at 20.70 km, the recorded depth was also highest at 30.00, and so was the intensity at 4.90. While the intensity remained constant until it dropped to nil between 1960 and 1969, the depth and distance values kept dropping until they too reached their lowest in 1970–79.
I used a pie chart to present the average direction of the quakes. Most of the earthquakes were directed towards the north-west: 5,406, against only 141 headed west. The second-highest count was in the south-west with 3,276, and the third in the south-east with 1,239.
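The direction counts above can be turned into the percentage shares a pie chart displays. A small sketch using the figures from the chart (notably, these four directions sum to exactly the 10,062 records mentioned earlier):

```python
# Direction counts taken from the report's pie chart.
counts = {"north-west": 5406, "south-west": 3276, "south-east": 1239, "west": 141}

# Convert raw counts into the percentage shares shown on the pie chart.
total = sum(counts.values())
shares = {d: round(100 * n / total, 1) for d, n in counts.items()}
```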
Reflection
The software and applications that I learned during this project greatly simplified the process of cleaning and transforming data. They offer many other features that I haven’t used in this project, and I would like to explore them.
I played around with a lot of different datasets before I chose this one. There were a few I found rather challenging, and I am going to continue working on simplifying those.
Values like ‘cities’, ‘longitudes’, and ‘latitudes’ are very broad, and I would like to work on representing them graphically in my future projects.