Social Media Data Analysis Guide for Beginners

Yogyakarta, June 24th 2021 - The Institute for Policy Development UGM held the “Beginner’s Guide for Social Media Data Analysis” event on Thursday (24/6). The event featured Idin Virgi Sabilah, Junior Researcher at the Institute for Policy Development, as the speaker. It took place from 9:00 a.m. to 10:00 a.m. via Zoom Meeting.

Social media data is increasingly used as a primary source in social sciences and humanities research. Interactions in the digital world leave many traces in the form of data, which can be readily analyzed for research purposes. In general, social media data analysis consists of three stages: capture, understand, and present. First, data is collected from various sources and the relevant information is extracted from it. Next, the available data is analyzed further by sorting it according to the needs of the research. Finally, once the data has been analyzed, the findings are presented.
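The three stages can be sketched in a few lines of Python. This is only an illustrative sketch, not the workflow shown at the event: the sample posts, the function names, and the choice of hashtag counting as the "understand" step are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical sample posts standing in for data captured from a platform.
SAMPLE_POSTS = [
    {"user": "a", "text": "Vaccination schedule announced #health #policy"},
    {"user": "b", "text": "New bus routes open today #transport"},
    {"user": "c", "text": "Clinic hours extended #health"},
]


def capture(posts):
    """Stage 1: collect raw records and keep only the fields we need."""
    return [{"user": p["user"], "text": p["text"]} for p in posts]


def understand(records):
    """Stage 2: analyze the data, here by counting how often each hashtag appears."""
    counts = Counter()
    for record in records:
        for word in record["text"].split():
            if word.startswith("#"):
                counts[word.lower()] += 1
    return counts


def present(counts):
    """Stage 3: turn the findings into a readable summary, most frequent first."""
    return [f"{tag}: {n}" for tag, n in counts.most_common()]


summary = present(understand(capture(SAMPLE_POSTS)))
print("\n".join(summary))
```

In a real study, the capture stage would read from a platform's tools or API instead of a hard-coded list, and the present stage would typically feed a chart rather than plain text.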

Idin explained that several social media platforms are equipped with their own data collection tools. However, data retrieval is usually limited to a certain time period.

“The big websites have their own tools for those who want to collect their data; they provide developer portals such as Twitter analytics, Facebook analytics, Instagram business tools, YouTube Creator Academy, and so on. We can analyze from there, but there are shortcomings, such as being limited to the last week or the last few months; it cannot go back years,” he said.

In addition to these tools, a very popular option for collecting data is Python, a general-purpose programming language. Python is more versatile for retrieving data, and it can be used not only for research but also for web development, text processing, AI, machine learning, and so on.

Python’s versatility also means that there is a wide variety of libraries that make Python more useful for specific purposes, including research.

Python itself is only the starting point; we still need other tools to begin the data analysis process, for example Tweepy, a Python library for accessing Twitter data. Once Tweepy is installed alongside Python, we can use the ready-made code that is already available online with its installation instructions. After that, to analyze or visualize the data, we can use Flourish or Tableau, which offer various forms of data visualization.
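As a rough sketch of this collection step, the code below fetches recent tweets through a Tweepy client and writes them to CSV, a format that Flourish, Tableau, Stata, and SPSS can all import. It assumes Tweepy version 4 and its `tweepy.Client.search_recent_tweets` method; the helper names `collect_recent_tweets` and `write_csv`, the search query, and the bearer token are placeholders for illustration, not the code used in the talk. API access terms may also differ depending on the account.

```python
import csv


def collect_recent_tweets(client, query, limit=10):
    """Fetch recent tweets matching `query` and return them as plain dicts.

    `client` is expected to behave like tweepy.Client, whose
    search_recent_tweets method accepts max_results between 10 and 100.
    """
    response = client.search_recent_tweets(
        query=query,
        max_results=max(10, min(limit, 100)),  # clamp to the allowed range
        tweet_fields=["created_at"],
    )
    rows = []
    for tweet in response.data or []:  # data is None when nothing matches
        rows.append({
            "id": tweet.id,
            "created_at": str(tweet.created_at),
            "text": tweet.text,
        })
    return rows


def write_csv(rows, handle):
    """Write the collected rows as CSV with a header line."""
    writer = csv.DictWriter(handle, fieldnames=["id", "created_at", "text"])
    writer.writeheader()
    writer.writerows(rows)


# Usage (requires `pip install tweepy` and your own credentials):
#   import tweepy
#   client = tweepy.Client(bearer_token="YOUR_BEARER_TOKEN")  # placeholder
#   rows = collect_recent_tweets(client, "kebijakan publik -is:retweet")
#   with open("tweets.csv", "w", newline="", encoding="utf-8") as f:
#       write_csv(rows, f)
```

The recent-search endpoint used here only reaches back about a week, which matches the time limitation mentioned above; older data would need a different access level or source.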

However, in some studies, data visualization is not the final step. Once the data has been collected and analyzed, we sometimes still need calculations or statistics using Stata or SPSS.

“So, yesterday I collected data from Twitter, then analyzed it using Flourish, after that I calculated it using Stata, so that the research was produced as I expected,” Idin said.