A Bibliometric Mapping of Utilization of Google Trends for Examining Stock Market Dynamics

Abstract In the last two decades, research into the use of the internet to gauge investor interest in stock markets has grown rapidly, accompanied by a surge in academic publishing. This study examines the academic literature on the role of web search queries in investors' investment decisions. Using bibliometric approaches, it presents the current state of the art and reveals significant gaps in the available literature on investor attention. Research publications were obtained from the Scopus database using keyword and reference searching methods. Specifically, the study uses citation analysis, keyword analysis, co-authorship and bibliographic coupling to analyze publications on the utility of Google Trends in stock market activity. The work identifies prolific authors, the most contributing documents, the most productive countries, the predominant domains, and the utility of Google Trends for the purpose of forecasting and predicting.


Introduction
Web search is a rich source of real-time information for determining current trends and opinions. The public mood expressed on the internet reflects society on a large scale. The evident penetration of the internet into human lives has led to growing interest among researchers in this area. Technological change over the past few years has made a huge amount of data and information available to various fields. Kumar et al. (2021) state the need for technological adoption in all fields for the growth of both developed and developing countries. Today, web search has emerged as the most important platform for gathering information about the behavior of individuals. A user's attention or sentiment can easily be traced through online search activity. Various economic agents need information about individuals' behavior and attention. Search queries on Google and social media posts submitted by millions of people across the world every day are undoubtedly good indicators of both. Kumar & Ayedee (2020) discuss the use of social media marketing tools for the growth and expansion of small and medium enterprises.
The literature on Google Trends in stock market activity acknowledges that online behavior has a growing influence on investors' decisions about stock market investments. In finance in particular, there is always a desire to predict stock market prices. Modeling the continuously changing trends in financial markets and generating meaningful real-time predictions about significant changes remains a real problem. At present, the internet and social media platforms are used extensively to express opinions and sentiments about every event. The area has emerged as a fascinating field of research because a huge amount of data is available and much can be extracted from it to generate meaningful results and new knowledge. To analyze the contribution of Google Trends to the study of investor sentiment and stock markets, previous studies that used Google Trends were reviewed. Google Trends was launched in 2006. The first study on Google Trends, by Ginsberg et al. (2009), opened the door to further research in this domain; it focused on predicting the spread of influenza. Da et al. (2011) published the first paper in the context of investor behavior and stock market activity. Over the past few years, the increase in internet usage and the collection of unprecedented amounts of data over the web have raised considerable academic interest in this field.
Numerous bibliometric studies cover a wide range of issues related to stock market activity and Google Trends. Most of them have focused on the broader themes of Google Trends, big data analytics, and behavioral finance. López-Cabarcos et al. (2019) conducted a bibliometric study on investor sentiment, identified the relevance of the field, and explained that the base articles are related to traditional finance and behavioral finance. The stream is also related to the disposition effect, another important stream of behavioral finance. The variety of concepts studied under behavioral finance is explained by Paule-Vianez et al. (2020). Their bibliometric study identified themes such as investor sentiment, expected stock return, disposition effect and overconfidence as motor themes under the umbrella terms of behavioral finance and behavioral economics. The increased interest in this field can also be attributed to the emergence of new internet-based measures of sentiment. Zhang et al. (2021) analyzed the emergence of big data analytics through a bibliometric study and identified deep learning and visual analytics as fields with high scope for future research. Their cluster-wise insights point to saturated areas such as firm performance and dynamic capabilities, which require a revival of research through an interdisciplinary approach. Valcanover et al. (2020) conducted a systematic literature review of behavioral finance studies in the Journal of Behavioral Finance and highlighted the important authors, countries and papers in the field. Fariska et al. (2020) reviewed the literature on investors' microblogging sentiment about the stock market and explored the emergence of online platforms as an important place for investors to exchange stock market information.
With the advent of technology, new ways of capturing investor sentiment through social media platforms such as Twitter have emerged. Komalasaria et al. (2021) examined herding behavior as a research field through a bibliometric and systematic literature review and identified the dimensions already studied as well as insights for future work. The literature identifies uncertainty of information in capital markets as a trigger of herding behavior. As future scope, researchers can explore and test the market conditions that induce herding behavior. Herd behavior in financial markets was also studied bibliometrically by Choijil et al. (2022), who found growth in research on herding behavior after the subprime crisis. Their study supports the findings of Komalasaria et al. (2021) by identifying the explanation for the existence of herd behavior as future scope of research. The bibliometric overview of volatility by Avilés-Ochoa et al. (2021) reveals estimation, forecasting and modeling as important issues associated with volatility in financial markets. Su et al. (2021) review the literature on online public opinion expressed through news and social media and its impact on asset prices. The bibliometric study of Abdulrasool et al. (2020) on global research trends in behavioral finance analyzes 967 publications from the Scopus database. It finds that the US and China have the most publications and good international collaboration. Currently explored areas include price anomalies and noise traders. The study also identifies sentiment analysis and perceived value as potential areas for future studies.
Prior to this review, only a limited number of bibliometric publications had appeared in the last two decades on the specific themes of financial markets, investor behavior, investor attention and sentiment. Usually, the broader domain is the focus. No work focusing specifically on this area of research could be traced. Such gaps make it necessary to provide a roadmap for future research.

Purpose of the Study
To canvass this field of research more precisely, this study aims to identify the relevant authors, journals and core knowledge on Google Trends in the field of stock market returns. Its fundamental purpose is to present the prevailing state of research in the field. The study further aims to determine the influential work, outline the intellectual structure of the field and identify the gaps (Chhabra & Agarwal, in press; Chhabra et al., 2021; Hassan et al., 2021). It answers two questions: how research has evolved over the years, and what the recent trends in this domain are. In recent years, investor sentiment and its relationship with stock market activity have acquired great relevance, as the increase in the number of recent publications shows. The study provides a comprehensive overview of the structure of research and identifies possible future lines of research along with recent trends. Specifically, it uses citation analysis, keyword analysis, co-authorship and bibliographic coupling to analyze publications on the utility of Google Trends in stock market activity. Network analysis with the VOSviewer software is used to identify research trends, emerging keywords, saturated areas, the most contributing countries, the journals publishing in this field and prolific authors. The study illustrates insights into research behavior through bibliometric analysis. The review helps practitioners, policy makers, educators and researchers grasp the state-of-the-art developments in the field of investor attention in stock market activity as measured through web search behavior.

Database, Keywords & Inclusion
To gather the information relevant to the study, data were retrieved on 16 September 2021 from the Scopus database. The study period runs from 2009 to the retrieval date, as the first study on Google Trends was conducted by Ginsberg et al. (2009). The data retrieval process is shown in Figure 1. The terms used most interchangeably in the literature include Google Search Volume Index, Google Trends, GSVI and Google. A string of appropriate search terms (TITLE-ABS-KEY ("GSVI" OR "Google Trends" OR "Google" OR "Google Search Volume Index" AND "Stock Returns" OR "Forecasting" OR "Predictability" or "Investor Attention" or "Investor Sentiment")) was therefore used to search for relevant papers, yielding 1334 initial results. Limiting the results to English-language articles reduced the set to 1310 articles. To ensure that only relevant articles were included, articles whose subject areas and keywords were not related to financial markets were excluded, leaving 445 articles. Removing duplicates left 442 articles. The abstracts of these articles were then screened for relevance, and 274 papers were finally included in the bibliometric analysis.
The 274 publications under study originate from 57 countries, of which the top 20 are shown in Figure 3. The largest share, 15% of publications, comes from the United States, followed by China with 11% and India with 7%. The contribution of countries such as Australia, Canada and Hong Kong seems scanty.
[...] (2014) is the second most cited paper but has the maximum link strength and is located at the center of the map shown in Figure 4, which shows how interrelated the study is with other research work. The study of Urquhart A. (2018) is among the top 5 documents by citations but has a link strength of 0.
The network visualization map of co-authorship in Figure 5 shows that, out of 556 authors, only 170 meet the minimum threshold of 1 document with at least 10 citations, and of those only 8 are linked. This reveals weak co-authorship among authors in this field of research. Co-authorship links among institutions and countries were also found to be insignificant.
Clustering of Author Keywords for Google Trends-related Literature
Keyword analysis is conducted to identify the keywords most associated with the field of study. The minimum number of occurrences of a keyword is set to 5. Out of 1248 keywords, only 53 meet the threshold. Forecasting is the most used keyword with 79 occurrences, followed by Google Trends with 66 occurrences and investor attention with 52 occurrences. Initially, research aimed to establish whether Google Trends can be used for stock markets. The prominence of forecasting as a keyword emphasizes the utility of Google Trends data for predicting and forecasting. Forecasting also has the maximum link strength of 264, followed by Google Trends (168), search engines (134) and commerce (121). The link strength reflects how often a keyword co-occurs with other keywords and is computed using the full counting method.
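The occurrence threshold and link strength used in the keyword analysis can be illustrated with a minimal Python sketch. It is not the VOSviewer implementation. It assumes that full counting means each publication in which two keywords appear together adds 1 to their link, and that total link strength is the sum of a keyword's link weights. The function name `keyword_cooccurrence`, the input format and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def keyword_cooccurrence(docs, min_occurrences=5):
    """Count keyword occurrences, pairwise links and total link strength.

    docs: list of keyword lists, one list per publication (assumed format).
    """
    # Normalize case and drop duplicates within each publication.
    keyword_sets = [{kw.strip().lower() for kw in doc if kw.strip()} for doc in docs]

    # Number of publications in which each keyword appears.
    occurrences = Counter(kw for kws in keyword_sets for kw in kws)
    kept = {kw for kw, n in occurrences.items() if n >= min_occurrences}

    # Full counting (assumption): every co-occurrence adds a weight of 1.
    links = Counter()
    for kws in keyword_sets:
        for a, b in combinations(sorted(kws & kept), 2):
            links[(a, b)] += 1

    # Total link strength of a keyword = sum of the weights of its links.
    total_link_strength = Counter()
    for (a, b), weight in links.items():
        total_link_strength[a] += weight
        total_link_strength[b] += weight

    kept_occurrences = {kw: occurrences[kw] for kw in kept}
    return kept_occurrences, dict(links), dict(total_link_strength)


# Toy data, not taken from the study.
toy_docs = [
    ["Forecasting", "Google Trends"],
    ["forecasting", "google trends", "investor attention"],
    ["Forecasting", "Investor Attention"],
    ["Bitcoin"],
]
occ, links, tls = keyword_cooccurrence(toy_docs, min_occurrences=2)
```

With the toy data and a threshold of 2, "bitcoin" is dropped because it occurs only once, and "forecasting" ends up with the highest total link strength because it co-occurs with both of the other retained keywords.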
Examining the network analysis results for author and index keywords confirms that, qualitatively, there are at least 4-5 clusters. Visual examination of Figure 6 shows the presence of 5 clusters. The keyword clusters identified are presented in Table 2 with appropriate names. The three prominent terms forecasting, Google Trends and investor sentiment are the biggest circles of three different clusters: green, blue and red respectively. The size of a circle depicts the frequency of occurrence of the word, so the circles of important items appear large in the network map. The blue cluster represents keywords related to sources such as the internet, social media, the world wide web and Google Trends, and their association with prediction and modeling techniques such as ARIMA. The green cluster represents terms such as machine learning, deep learning and data mining and their utility for forecasting financial markets and time series data. It covers the use of artificial intelligence techniques for forecasting. The red cluster identifies areas such as market efficiency, stock market returns, trading volume, bitcoin, coronavirus and cryptocurrencies, where Google Trends data is applied in research. The yellow cluster captures the interrelatedness of search volume data with commerce as a field of search and with firm strategy decisions regarding costs and investments. Crude oil has also emerged as an area where researchers have applied search volume data.
Over the years, a plethora of minor keywords arose. Google Trends and forecasting are among the keywords at the center. These were the search terms that were already at the top of the list in terms of frequency. The initial years of research focused mainly on using the internet and search engines for information retrieval.
In 2014-16, the focus shifted towards forecasting and the use of various statistical, machine learning and deep learning tools. In 2018, investor attention was at the heart of research in this domain. Recent trends include studies related to COVID-19, coronavirus, bitcoin, cryptocurrencies and time series forecasting. The emerging themes and topics with further scope are identified using keyword co-occurrences. Analyses that are interlinked with other media have also emerged. Synthesizing the results, two research streams can be recognized: forecasting or predicting, and the determination of investor attention or investor sentiment. The new trends identified include research interest in evolving topics such as cryptocurrencies and bitcoin.

Bibliographic Coupling
Bibliographic coupling identifies the connection between two publications when both cite a common third publication in their bibliographies (Kessler, 1963; Martyn, 1964). There is a good chance that the two articles address related subject matter. The strength of the connection is computed statistically and is known as coupling strength or link strength (Boyack and Klavans, 2010). A higher link strength indicates that the two publications share more cited references. Figure 8 shows the network visualization map of bibliographic coupling with documents as the unit of analysis, using the full counting method and a minimum of 20 citations per document. Out of the 274 publications, 40 documents had at least 20 citations, of which 30 were connected. The map shows three clusters, and their interconnectivity suggests that the field is emerging as a cohesive and connected field of research. The top 15 publications with the highest link strength and highest citations are shown in Table 3.
Bibliographic Coupling by Sources
The sources of the publications were analyzed with a minimum of three documents per source and a minimum of 10 citations. The resulting bibliographic coupling network visualization map in Figure 9 shows 2 clusters of 16 linked journals. The presence of "Journal of Forecasting" at the center of the map indicates the dominance of the journal in publishing articles in this domain. Table 4 shows journals with a significant number of publications. The journal with the most publications is "Energy Economics" with 8 articles. It also has the highest link strength, 969. A high link strength means that a journal's articles are more closely related to the field of study. The journal mainly publishes articles related to oil and crude prices but is highly relevant as a reference for stock market and equity research. "Journal of Forecasting" has the highest citation rate.
In terms of published articles, the journal "Economic Modelling" stands second in both citations and link strength. The presence in the top 10 of some unrelated journals with lower link strength and more citations indicates an interdisciplinary approach. A journal with low citations and high link strength indicates a strong relationship between publications.
In terms of average publication year, the "International Review of Financial Analysis" is at the center of the map. This reinforces the relevance of this journal to the field of research under study. Shades of green represent average publication years around 2018-2019. The journals that have been publishing in recent years include "International Review of Economics and Finance", "Review of Behavioral Finance" and "Global Finance Journal", shown in yellow in Figure 10. The results seem consistent with the aforementioned conclusions about the link strength of this journal. Few documents have been published in these journals because they are recent entrants, but their link strength is comparatively good. The journals that published early on but now have fewer publications are "Economic Modelling" and "Applied Economics".
The bibliographic coupling of countries revealed that the 274 publications originated from authors in 57 countries. Of these, only 18 connected countries have a minimum of five publications with at least 10 citations. Bibliographic coupling of countries publishing on and using Google Trends in the field of stock markets is presented in Figure 11 and Table 5. It shows the most productive countries that use similar literature in their publications. The United States and China are the nuclei of references and have strong connections with other countries. They belong to different clusters, green and red respectively. The countries with the largest numbers of documents and citations are not the countries with the highest total link strength.
Although the United Kingdom and Germany have more citations in the research field, China has more link strength relative to its number of citations, as shown in Table 5. Japan is the only country in cluster 3, represented in blue, with 119 citations and a link strength of 1527. The United Kingdom has its strongest research links with European countries such as Germany, Poland, Greece and Italy. All these countries belong to the same cluster 2 (green). Developing nations such as India, China, Malaysia and Portugal belong to cluster 1 (red) and draw on the same documents in their literature. The contribution of Portugal and Tunisia seems scanty in terms of citations, and that of South Korea in terms of link strength.
[...] India are the most productive countries in terms of papers published, but in terms of citations the United States is followed by Germany and the United Kingdom. Zhang et al. (2021), in their study on big data analytics and machine learning, also identify the United States as the most productive country, followed by India, China and the United Kingdom. The topic has been widely published in Energy Economics and the Journal of Forecasting. The most cited journal on the basis of bibliographic coupling is the Journal of Forecasting, and the most influential journals with the highest link strength include Energy Economics and Economic Modelling. Keyword co-occurrence helps in identifying thematic evolution over time and in determining saturated and emerging topics. At the beginning of the analyzed period, the prevailing keywords were related to information retrieval. From 2018, keywords specific to sentiment appeared, which corroborates the relevance of these studies and is consistent with the boom in publications. The emerging keywords identified include bitcoin and cryptocurrencies. The bibliographic coupling results for documents reveal the work of Vosen S. (2011) as the most prestigious article on the basis of the highest number of citations, followed by Vozlyublennaia N. (2014). The bibliographic coupling results for authors identify Molnar P. as the most prolific author in terms of citations and Wang S. as the author with the maximum number of publications and the highest total link strength.
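The coupling strength defined at the start of this section, the number of cited references two publications share, can be sketched in a few lines of Python. This is an illustrative sketch under simple assumptions, not the procedure used by VOSviewer. The publication and reference identifiers are made up, and the helper names `coupling_strengths` and `total_link_strength` are hypothetical.

```python
from itertools import combinations


def coupling_strengths(references):
    """Return the number of shared cited references for each coupled pair.

    references: dict mapping a publication id to the ids it cites (assumed format).
    """
    ref_sets = {pub: set(refs) for pub, refs in references.items()}
    strengths = {}
    for a, b in combinations(sorted(ref_sets), 2):
        shared = len(ref_sets[a] & ref_sets[b])
        if shared:  # keep only pairs that are actually coupled
            strengths[(a, b)] = shared
    return strengths


def total_link_strength(strengths):
    """Sum the coupling strengths of all links attached to each publication."""
    totals = {}
    for (a, b), weight in strengths.items():
        totals[a] = totals.get(a, 0) + weight
        totals[b] = totals.get(b, 0) + weight
    return totals


# Toy data, not taken from the study.
toy_refs = {
    "P1": ["R1", "R2", "R3"],
    "P2": ["R2", "R3", "R4"],
    "P3": ["R3", "R5"],
    "P4": ["R6"],
}
strengths = coupling_strengths(toy_refs)
totals = total_link_strength(strengths)
```

In the toy data, P1 and P2 share two references and are therefore the most strongly coupled pair, while P4 shares no reference with any other publication and has no links at all.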

Bibliographic Coupling by Countries
With the intent of identifying valuable insights into stock market dynamics with Google Trends, the existing literature was studied bibliometrically. The scholarly work reveals the importance of using web search data from Google in making investment decisions in stock markets. Although the study provides useful insights, it also has a few limitations. Because some related terms may be absent from the search criteria, a few studies may have been missed. Reliance on Scopus as the only database is also a limitation, as not all research work is indexed in Scopus and additional results could be found by searching other databases. Also, the study focuses on a precise aspect rather than analyzing the broader picture, so the overall view is missing.
Google Trends has the benefit of collecting large amounts of data, processing it to make analysis easier and releasing it for free. As a result, Google Trends is an excellent option for demonstrating the benefits and drawbacks of using big data. Furthermore, research trends in Google Trends studies provide crucial insights into how big data applications and usage are changing. The study differs from traditional literature reviews in that it helps new researchers understand the extent to which the topic has been researched, its evolution, and its emergent and stagnated trends. Since the topic is emerging and has gained researchers' attention only in recent years, there is still considerable scope for future work in terms of theoretical development, contextual coverage and methodologies. Using Google data to determine investor behavior has vast implications, and an interdisciplinary approach combining information technology with finance, commerce and economics can guide the way.