- Introduction to the visualization(s), including subject matter and goals;
This visualization grew out of a personal interest. I enjoy browsing the videos that YouTube recommends because the algorithm knows my interests very well: it recommends videos based on my viewing history. That made me curious about other people’s interests as reflected in their YouTube data.
In this project, I explore the most popular videos in other countries. My goal is to use dashboards to display as much detailed key information as possible, and to use visual design to show the trends in a given country. I selected three countries and created a dashboard for each of them. Finally, I made a tree chart as a summary to show the common interests of France, Germany, and the UK.
- The process used to design visualizations, including UX research methods (participants, activities);
Software: Tableau Software (/tæbˈloʊ/ tab-LOH) is an American interactive data visualization software company founded in January 2003 by Christian Chabot, Pat Hanrahan, and Chris Stolte in Mountain View, California. The company is currently headquartered in Seattle, Washington, United States, and focuses on business intelligence. On August 1, 2019, Salesforce.com acquired Tableau.
I chose Tableau for my final project because a dashboard is a highly interactive way to show different dimensions of a dataset. Simplicity is a primary goal of many well-designed websites: limiting visual clutter helps users navigate and understand the content easily. Dashboards share the same goal, so it is no surprise that the rules and tools of web design offer many lessons for creating dashboard interfaces. Users can explore information by manipulating different bars, points, and so on. For my UX research, I wanted to know what users expected from my topic. One of my participants, Hwang, told me that she expected to see the most popular video category in each country, the most popular video creators (YouTube channels), the most popular comments, and the viewing history by year.
User-centered design (UCD) or user-driven development (UDD) is a framework of processes (not restricted to interfaces or technologies) in which usability goals, user characteristics, environment, tasks and workflow of a product, service or process are given extensive attention at each stage of the design process.
The main chart is the map, because the shape of a country is the clearest way for users to identify it. On top of the map, I float the category ID and the yearly trends of each category. My intention is for the map to be the main part of the dashboard, with the other charts arranged around it. People naturally look at the left corner first, so I place the map sheet on the left.
- The rationale for design choices, demonstrating that the design is based on the theory and best practices of information visualization, as well as user research;
“The grid system is an aid, not a guarantee. It permits a number of possible uses and each designer can look for a solution appropriate to his personal style. But one must learn how to use the grid; it is an art that requires practice.” (Josef Müller-Brockmann) The grid is very important for the design layout, and I based it on the content hierarchy. I set the map as the first level of the hierarchy, since the shape of the country directly shows who the protagonist is. The attached line charts come with it, because both are organized by category ID. The third level is the detailed value display. The last one is the top words in titles.
My participant Xiaoke Li said she expected to see pie charts and bar charts, and she expected to see the trends and categories. A unified color scheme would make the data clearer. Because the visual display is the user’s first impression, she suggested keeping the dashboard clear and clean so that it is easy to understand.
- Findings, both of the visualization and of the UX research, organized into major topics
My major topic is people’s YouTube video preferences in each country. There are over forty thousand records for a single country. I cannot show them all, but I can keep the most popular videos and preserve as many of their variables as possible. The audience is curious about the URLs of the videos, and showing them all on a dashboard is difficult. Fortunately, interaction helps here: when the user clicks on a chart, a pop-up menu shows the rest of the description. The pie chart should be emphasized in the center. Two of my participants like seeing the values of the different categories. There are also not many category IDs. For these reasons, my strategy is to build the charts around category ID.
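The idea of keeping only the most popular videos per category can be sketched in a few lines. This is a minimal Python illustration, not the Tableau workflow used in the project. The function name `top_videos_per_category` and the sample rows are hypothetical, and the sketch assumes that the same `video_id` can appear on more than one trending day, so it first keeps the highest-view record for each video.

```python
from collections import defaultdict


def top_videos_per_category(rows, n=3):
    """Return {category_id: [row, ...]} with the n most-viewed videos per category."""
    # Assumption: a video may be listed on several trending days,
    # so keep only its record with the highest view count.
    best = {}
    for row in rows:
        vid = row["video_id"]
        if vid not in best or int(row["views"]) > int(best[vid]["views"]):
            best[vid] = row

    # Group the de-duplicated videos by category.
    by_category = defaultdict(list)
    for row in best.values():
        by_category[row["category_id"]].append(row)

    # Sort each category by views (descending) and keep the first n rows.
    return {
        cat: sorted(videos, key=lambda r: int(r["views"]), reverse=True)[:n]
        for cat, videos in by_category.items()
    }


# Hypothetical sample rows using column names from the dataset.
sample_rows = [
    {"video_id": "a1", "category_id": "10", "title": "Song A", "views": "500"},
    {"video_id": "a1", "category_id": "10", "title": "Song A", "views": "900"},
    {"video_id": "b2", "category_id": "10", "title": "Song B", "views": "700"},
    {"video_id": "c3", "category_id": "24", "title": "Show C", "views": "300"},
    {"video_id": "d4", "category_id": "10", "title": "Song D", "views": "100"},
]
top = top_videos_per_category(sample_rows, n=2)
```

Every column of a kept row survives the filter, so fields such as `title` or `description` stay available for a pop-up detail view.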
My data includes many variables:
video_id / trending_date / title / channel_title / category_id / publish_time / tags / views / likes / dislikes / comment_count / thumbnail_link / comments_disabled / ratings_disabled / video_error_or_removed / description.
This dataset includes several months (and counting) of data on daily trending YouTube videos. Data is included for the US, GB, DE, CA, and FR regions (USA, Great Britain, Germany, Canada, and France, respectively), with up to 200 listed trending videos per day. It now also includes data from the RU, MX, KR, JP, and IN regions (Russia, Mexico, South Korea, Japan, and India, respectively) over the same time period. Each region’s data is in a separate file. Data includes the video title, channel title, publish time, tags, views, likes and dislikes, description, and comment count.
Limitations: The biggest limitation of this dataset is that it does not cover complete years, so I cannot predict trends by year. It also does not cover every country. To answer my question, I need to choose countries carefully. Population size is the first thing to consider, because comparisons between countries with similar populations are more reliable. For instance, 1000 people liking music videos in India means something different from 1000 people liking music videos in Japan. Therefore, based on each country’s share of the world population, I group Russia (1.89% of the world population), Mexico (1.63%), and Japan (1.63%) for comparison, and I group the UK (0.857%), Germany (1.07%), and France (0.865%) for analysis.
Data source: https://www.kaggle.com/datasnaek/youtube-new (this data comes from the YouTube API).
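One way to make the population-based grouping explicit is to sort the countries by their share of the world population and split the list at the largest gap. This is only an illustrative Python sketch, not the procedure actually followed in the project. The function name `split_at_largest_gap` is hypothetical, and the percentages are the ones quoted above.

```python
def split_at_largest_gap(shares):
    """Split countries into two groups at the largest gap in population share."""
    # Sort (country, share) pairs from smallest to largest share.
    ordered = sorted(shares.items(), key=lambda item: item[1])
    # Difference between each pair of neighbouring shares.
    gaps = [ordered[i + 1][1] - ordered[i][1] for i in range(len(ordered) - 1)]
    # Cut right after the position of the largest gap.
    cut = gaps.index(max(gaps)) + 1
    smaller = [name for name, _ in ordered[:cut]]
    larger = [name for name, _ in ordered[cut:]]
    return smaller, larger


# Shares of the world population (in percent) as quoted in the text.
population_share = {
    "Russia": 1.89,
    "Mexico": 1.63,
    "Japan": 1.63,
    "UK": 0.857,
    "Germany": 1.07,
    "France": 0.865,
}
smaller, larger = split_at_largest_gap(population_share)
```

With these figures the largest gap lies between Germany (1.07%) and Mexico (1.63%), so the split reproduces the two groups chosen by hand.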
Overall, comedy videos make up the largest share of trending videos in France, and music videos are the second largest. The descriptions in the bar chart include links, so users can go directly to the most popular video in France. In the “Trending by years” chart, music videos show the strongest ongoing trend from 2014 to 2018. Following Professor Chris’s suggestion, I use a bar chart to show the value of each category. That chart shows that people like to comment on music videos, which also have the highest satisfaction rate of any category. In addition, entertainment has the largest number of dislikes. I also use R to run a text analysis that finds the most frequent words in titles.
The pie chart shows that people in England love music a lot. Music has the largest number of comments, dislikes, likes, and views. As in France, entertainment is the second-largest category. From 2014 to 2018, music has the strongest ongoing trend. The most frequent words also reveal people’s interests in England. I added a filter because showing every category ID on the pie chart would create a mess.
In Germany, people like entertainment more than music. The popularity of entertainment increased a lot from 2016 to 2017 and decreased from 2017 to 2018. In contrast, music’s popularity increased from 2017 to 2018. Overall, entertainment is No. 1 for comment count, dislikes, likes, and views. One more interesting finding is that Germans also enjoy music very much. They simply do not show as strong a preference for music over other categories as people in England do.
I use R to count how often each word overlaps across the countries’ top 100 most frequent title words, and I put the results into the tree chart. The four colors represent how many times a word appears in the CSV file, on a scale from four down to one. If a word appears three times, it means people in three countries all like it. ‘Episod’ and ‘video’ appear four times. ‘1, 2, 2017, 2018, 3, 4, 5, feat, ft, game, live, music, on, season, the, trailer’ appear three times, which means those words are on each country’s frequent-word list. The words in the remaining light green blocks are in two countries’ top title lists. People make decisions based on their knowledge and culture. It is interesting to find common human interests by arranging and visualizing their data.
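The overlap count described above can be sketched as follows. The project did this step in R, and this Python version only illustrates the idea. The function name `count_list_overlap` and the short word lists are hypothetical stand-ins for the real top 100 lists.

```python
from collections import Counter


def count_list_overlap(word_lists):
    """Map each word to the number of country lists that contain it."""
    overlap = Counter()
    for words in word_lists.values():
        # Use a set so a word counts at most once per country.
        overlap.update(set(words))
    return overlap


# Hypothetical, shortened top-word lists keyed by region code.
top_words = {
    "FR": ["music", "trailer", "clip", "video"],
    "DE": ["music", "trailer", "folge", "video"],
    "GB": ["music", "official", "video", "live"],
}
overlap = count_list_overlap(top_words)

# Words present in every country's list.
shared_by_all = sorted(w for w, c in overlap.items() if c == len(top_words))
```

The resulting counts are the kind of value that can drive the color scale of a tree chart, with one color per overlap level.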
- recommendations for revision based on those findings.
Based on feedback from a class presentation, I revised how I display views, likes, dislikes, and comments. My earlier mistake was combining two line charts, likes and views, into one. When likes appeared to show a higher value than views, the display misled the audience with confusing visual logic. At first, I tried using category as a single column and showing the values with bar charts. However, I ran into trouble because the values differ so much: some are very high and others are very low, so the display was not visually balanced. I then chose category as the x-axis and value as the y-axis, and the bar charts became more intuitive.
I used to draw the line charts in red. However, after considering cultural issues, I learned that Europeans read red as a negative signal, so I changed the color. I also received a suggestion during the presentation to use ‘edit alias’ to type in the name of each category, which makes my dashboard clearer.
By using Tableau, I finally achieved the goal of my project. As mentioned above, I now know which category is the most popular in each country. The pop-up interaction also delivers the URL, so the user can see which specific video is the favorite. I also used R to find the title words that the three countries have in common.
For further exploration, I would like to find a way to combine the three dashboards into one large dashboard. This would be challenging because the data could become messy and the layout would have to be designed carefully.
I think I found an interesting dataset for the project. My first obstacle was finding map data and combining the YouTube data with it. I used OpenRefine to create a new column so that the data lines up with the map. OpenRefine is a very strong tool for cleaning and editing files. My second difficulty was adding the pie chart to the map. By following the official tutorial, I was finally able to work it out: https://help.tableau.com/current/pro/desktop/en-us/maps_howto_filledpiechart.htm My third challenge was using R to calculate the frequent title words. I used text packages and libraries to extract the title column and count the words. Finally, I exported the result as a CSV file from RStudio.
There is one weakness of Tableau that I don’t like: I can’t edit colors by RGB values. The software has certain colors that I can’t change. In the end, I had to compromise by trying to put together a pleasing color combination. If I had more time, I would spend it on color matching. The code and logic are the most important consideration when showing data, but color also helps a lot. After this project, I will spend more time analyzing visual displays of data.
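The title word count and CSV export described above were done in R. A rough equivalent in Python, offered only as a sketch under simple assumptions, looks like this. The function names `top_title_words` and `write_word_counts` and the sample titles are hypothetical, and the tokenization is a plain lowercase word split with no stemming or stop-word removal.

```python
import csv
import io
import re
from collections import Counter


def top_title_words(titles, k=100):
    """Return the k most frequent lowercase words across all titles."""
    counts = Counter()
    for title in titles:
        # \w+ matches runs of letters, digits, and underscores,
        # including accented characters.
        counts.update(re.findall(r"\w+", title.lower()))
    return counts.most_common(k)


def write_word_counts(word_counts, out_file):
    """Write (word, count) pairs to an open text file as CSV."""
    writer = csv.writer(out_file)
    writer.writerow(["word", "count"])
    writer.writerows(word_counts)


# Hypothetical sample titles standing in for the `title` column.
sample_titles = [
    "Official Trailer 2018",
    "Music Video feat. Artist",
    "Live Music 2018",
    "New music video",
]
top_words = top_title_words(sample_titles, k=3)

# An in-memory buffer is used here; a real file opened with
# open(path, "w", newline="") would work the same way.
buffer = io.StringIO()
write_word_counts(top_words, buffer)
```

The exported file has one word per row, a layout that a charting tool can read directly.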