Suggested citation format:
Ciolek, T. Matthew. 2002. Targets of Electronic Attention in Asia: who watches whom in the cyberspace? - an exploratory study. A paper for the panel on "The Culture and Politics of the Internet in Asia: Empirical, Theoretical and Methodological Issues," The Annual Meeting of the Association for Asian Studies, Washington D.C., USA, 4-7 April 2002.
www.ciolek.com/PAPERS/electronic-attention2002.html

Targets of Electronic Attention in Asia:
who watches whom in the cyberspace?
- an exploratory study

by
Dr T. Matthew Ciolek,
RSPAS, The National Institute for Asia and the Pacific,
Australian National University, Canberra ACT 0200, Australia
tmciolek@coombs.anu.edu.au

a paper for the panel on
"The Culture and Politics of the Internet in Asia: Empirical, Theoretical and Methodological Issues,"
The Annual Meeting of the Association for Asian Studies, Washington D.C., USA, 4-7 April 2002.

Draft
Document created: 20 Mar 2002. Last revised: 31 Mar 2002

0. Abstract

This paper undertakes the first ever study of the geographical organisation of hyperlinks among web sites in Asia. Five computer-aided surveys of Asian cyberspace were conducted between June 2001 and January 2002. The surveys reveal the existence of about 14.6 mln web links originating and terminating in the fifty countries of the continent. Analysis of data reveals that the bulk of "core" (i.e. Asia-Asia) hyperlinks are attracted by only two countries: Indonesia and Japan. The study also finds that the largest volume of Asian hyperlinks originates in four countries: Japan, South Korea, China, and the Philippines. However, the numbers of links originating in, or leading to, various places in Asia do not appear to be directly influenced by the actual numbers of internet hosts in those countries. Therefore, there is a possibility that other, i.e. non-technological variables may play a role. The paper concludes with comments on the methodology of webometric research, and on current conceptual models of global cyberspace.

1. Introduction

The last decade has witnessed a dramatic increase in the number of public access online documents, electronic archives, databases and corporate web-sites. The resultant "cyberspace" or realm of networked knowledge is borderless, ubiquitous and massive. It also continues to grow at a very fast pace. Towards the end of 2001 it comprised at least 2.07 billion visible web pages (Google 2001). A comparably large volume of "invisible," i.e. publicly inaccessible documents is also known to exist (Lawrence 2001, Bailey & Craswell n.d.). So far, however, all these massive and unprecedented developments have not been adequately studied. The cyberspace structures and transformations which underpin the largest information revolution since the times of Gutenberg tend to be neglected by scholars.

There are multiple reasons for such an unsatisfactory situation. Firstly, there are not many tools for the study of cyberspace which are both readily accessible and reliable. Some of the early studies in this field depended on massive computer resources and very complex programming environments (e.g. Larson 1996, Woodruff 1996). Another strategy was to make special arrangements with administrators in charge of servers' log files so that investigators could query the logs at will (e.g. Bailey & Craswell n.d.). Although these methodologies tend to be reliable, they are also very expensive and cumbersome to apply. As such, they remain out of reach of the bulk of researchers with an interest in the Internet. There is also a group of works (Rodriguez & Manuel 1997, Thelwall 2001, Ciolek 2001, Smith & Thelwall 2001) which advocates analyses of the contents of publicly accessible commercial search engines. Naturally, the use of such facilities is fraught with many methodological difficulties. Search engines are highly variable in the results they produce, their functions are poorly and/or incorrectly documented, their search logic is often opaque, and over time they change the search functions they offer. However, despite all these problems, the emerging consensus is that search engines can be used to mine online statistics quickly and inexpensively, provided that such data extraction, as well as subsequent analysis, is done judiciously (Snyder & Rosenbaum 1999).

Secondly, there are major problems with the extent and completeness of our current knowledge-base. Published factual information concerning cyberspace remains sparse, unsystematic and patchy. Also, most of the relevant work is still preliminary in nature. However, a number of relevant studies already exist. These are: (a) a couple of exploratory (and microscopic in scale) analyses of citation patterns formed between scholarly documents (Rousseau 1997, Almind & Ingwersen 1997, Ingwersen 1998); (b) a handful of studies which parallel research on networks formed by phone calls (e.g. Louch et al. 1999) and which deal with the geographical organization of cyberspace at its intermediate level (Smith 1999a, 1999b, Smith & Thelwall 2001, Zook 2000, Ciolek 2001, Ciolek in press); and (c) about a dozen or so of the most general, almost abstract macro-studies addressing the size, and topology of the global web (e.g. Albert et al. 1999, Huberman and Adamic 1999, Broder et al. 2000, Lawrence 2001). Unfortunately, all these works publish only their summary and very terse statistics. This means that the research papers do not clearly separate their factual evidence from subsequent analyses and discussions, that the documentary information is presented on a select basis, and that their original detailed, unaggregated data are not made accessible at all for subsequent inspection, re-use and re-analysis by other researchers. In other words, in the field of webometrics, or the study of the organisation and behaviour of cyberspace, the chief currency is the final, synthetic conclusions of a researcher and not the actual research data underpinning those conclusions.

Thirdly, and possibly because of the above shortcomings in methods and data, very few general purpose (and testable) theories of the behaviour and organization of cyberspace are being developed. In fact, incipient model-building exercises have taken place only at cyberspace's macro-level. Two major approaches tend to stand out. Firstly, there is research by Broder et al. (2000) which used two AltaVista crawls to analyse the organisation of a sample of cyberspace comprising 200 million pages and 1.5 billion links. That study concluded that the macroscopic structure of the web is considerably more intricate than suggested by earlier experiments dealing with smaller samples of cyberspace. According to that study global cyberspace is shaped like a gigantic bow-tie. The large scale informational mainland is fringed with numerous tendrils and islands (Broder et al. 2000). This "bow-tie" is made of three distinct parts, the "core", and the "in" and "out" zones. The core is an archipelago of densely interlinked web pages. That body of online information represents some 28% of the total cyberspace. The core area is the target of attention of a further 21% of web pages organised in the form of an "in" zone. Those pages point to the core, although they are not linked-back from it. Thirdly, the core makes links to (though is not linked by) the additional 21% of cyberspace in the form of "out" pages. In other words, 70% of cyberspace's resources form a massive series of chains of documents. The next 21% of cyberspace constitutes informational "tendrils" attached to the "in" and "out" zones. Tendrils attached to "in" spaces are accessed from that area via unidirectional links. The reverse situation arises among tendrils connected to the "out" zone. Web pages contained by such tendrils make links to the "out" zone, but they are not linked from it. Finally, the residual 8% of cyberspace was found to form an archipelago of isolated islands which are sometimes linked to each other, but which are always disconnected from the mainland's triad.

In addition to the work on cyberspace's macro-topology there is also research which explores the so-called "small world" problem. That line of study tries to determine the typical number of hyperlink jumps (clicks) necessary to traverse the shortest existing hypertext path connecting two randomly chosen web pages. Although cyberspace might be very big indeed it is proposed that its overall diameter might be, actually, quite narrow. According to data collected by Albert et al. (1999) the degree of separation between any pair of web pages appears to be no larger than 19 hypertext jumps. This dramatic conclusion is however disputed by Broder et al. (2000) whose data suggest, instead, that the diameter of the central core is at least 28 hypertext jumps wide and the whole of cyberspace could be up to 500 clicks wide.

In stark contrast, no equivalent detailed models have been developed to account for hyperlink structures discernable at cyberspace's meso-level. For instance, analyses of distribution of links between clusters of web sites have enabled researchers to come to a handful of initial conclusions. These tended to be either very specific or extremely abstract. In sum, their overall message is that the studied subsections of cyberspace are not only rich but also complicated environments. Secondly, it has been observed that discerned hypertext connections within subparts of global cyberspace are neither random, nor are they uniform. This means that the connections tend to be chaotic, that is, that they are a product of some very complex and currently poorly identified processes. The final conclusion is that within global cyberspace distinct international communities of interest are constantly forming. Such online archipelagos of interconnected documents comprise web documents dealing with a specific subject matter (Larson 1996, Gibson et al. 1998), or documents belonging to a particular language group (Smith 1999b), or to a particular cluster of organisations (Smith 1999a), or to clusters of countries (Smith & Thelwall 2001).

In sum, the current state of our knowledge about cyberspace is very general and rudimentary indeed. In consequence, we are unable to answer even the most elementary (and pragmatic) questions such as how cyberspace is structured along major geographic, economic and socio-political lines (= exactly who hyperlinks with whom in cyberspace, and how intensively); which institutions, countries and geographical regions act as the major local, regional and global exporters (producers) and importers (consumers) of online information and where do they tend to concentrate; and what category of online data is in the greatest demand in variously delineated areas of cyberspace. Surely, to answer these questions one needs to take a larger yet sufficiently detailed view of the universe of online information, so that both the proverbial trees and the proverbial forest can be seen at the same time. Such an approach implies that a fairly large number of observations from several different places needs to be systematically collected so that any underlying patterns and regularities, if present in the data, can be detected and inspected. This in turn signals the need for an inexpensive, swift and easy-to-use data-gathering method.

This paper proposes to take precisely these three steps. First of all, it will look at a possible logic of hypertext connections constructed at the national as well as supra-national levels. More precisely, it will try to understand the nature of cyberspace by analysing spatial relationships in hypertext connections among web sites from all of the 50 countries and territories of Asia. Secondly, our study will deal with very large but still manageable data sets. Indeed, the study will analyse statistics pertaining to many millions of hypertext links. Thirdly, these masses of information will be collected via an efficient and user-friendly set of automated tools.

As an exploratory operation, the study will not attempt to test any particular hypothesis. Instead, it will attempt to delineate and describe a hitherto unknown territory and, whilst doing so, refine some of the existing research tools and conceptual apparatus.

2. Terminology

This paper makes use of a number of technical and geographical terms. The term "cyberspace" will be used here in a fairly narrow sense. It will denote a body of public access online information carried by "web pages" (defined below) spanned by a lattice of "hyperlinks" (defined below). Naturally, such a cyberspace can be defined through a wide range of variables. However, in this paper cyberspace will be dealt with exclusively in terms of its geography. Furthermore, the study will focus on an area which we will designate as the "core" cyberspace. By this term I mean that body of electronic information which is shaped by hyperlinks which both originate and terminate within its precise geographical (or other agreed upon) boundaries. Such an area is clearly distinct from two other types of online information, namely, the "inbound" and "outbound" zones. In our study, the former is established by links which lead to destinations within Asia although they originate outside the geographic boundaries of that continent. The latter is delineated by links which originate within Asia, but lead to other, non-Asian destinations. The lack of trustworthy data, however, means that neither the second nor third type of cyberspace will be discussed here.
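The three zones defined above can be expressed as a simple rule over the country of origin and the country of destination of a hyperlink. The following Python fragment is only a minimal sketch of that rule: the function name, the "external" label for links which do not touch the study area at all, and the five-code sample of Asian country codes are assumptions made for illustration, and are not part of the software used in this study.

```python
# Illustrative subset of Asian country codes (an assumption for this sketch;
# a full analysis would list all fifty codes covered by the study).
ASIAN_CODES = {"jp", "cn", "id", "tw", "kr"}


def classify_link(source_cc, target_cc, region=ASIAN_CODES):
    """Return the zone of one hyperlink: 'core', 'inbound', 'outbound' or 'external'."""
    source_inside = source_cc.lower() in region
    target_inside = target_cc.lower() in region
    if source_inside and target_inside:
        return "core"        # originates and terminates within the region
    if target_inside:
        return "inbound"     # originates outside, leads into the region
    if source_inside:
        return "outbound"    # originates inside, leads out of the region
    return "external"        # hypothetical label: link does not touch the region


print(classify_link("jp", "id"))  # a Japan-to-Indonesia link belongs to the core
```

Only links for which the function returns "core" correspond to the material discussed in this paper.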

The next term, a "web page" will mean an online, static, public-access document presented in hypertext (i.e. html) format. This usage refers to all types of text, image, sound and numeric files, regardless of their content, as long as they are hooked up to a network, and as long as they are indexed by some general-purpose search-engine such as Google, Fast or AltaVista. Therefore, our definition excludes those html documents which reside on an isolated intranet, or are dynamically created in response to some online query, or stay hidden behind a firewall, or require browser plugins, or are accessible by a password. In short, our web page will be that page which has been found and processed by a public search engine.

Thirdly, I shall make very frequent use of the words "hyperlink" and "web link", or "link", for short. Here each of these terms means an explicit one-directional hypertext device for navigation from one web document to another. In fact, cyberspace as a whole is a product of the subtle yet energetic interaction between web documents, their contents and the hyperlinks which glue pages together. For example, according to commercial investigations reported by Ward (2000), a "typical" web page contains about 52 hyperlinks knitting it tightly, albeit not always effectively (because about 10% of links always tend to be broken), with other parts of cyberspace. Hyperlinks are inevitably dynamic and complex phenomena. Sometimes one page may contain several links. Sometimes they may not be present on a web page at all. In that case a web page may be only "linked-to", but not "linked-back." Such dead-end web pages are an interesting feature as they constitute online cul-de-sacs. On the other hand, links may be plentiful and the hyperlinked pages may form informational chains, loops, and hierarchical trees of various lengths and sizes. Also, links - especially when they are generously applied - may create lengthy one- and two-directional passageways and thus form extensive navigational lattices, networks or labyrinths.

To carry out a hypertext jump of any magnitude, web links rely on the uniqueness of the internet addresses of their destinations. Such addresses always state (or imply) the name of the document in question, and its place in a subdirectory. They also precisely identify the web server, a machine which is asked to supply the relevant document to the requesting party.

This study will take into account only one part of such an address, namely the ccTDL part. The ccTDL, or Country Code Top Domain Level information (more commonly abbreviated ccTLD, for country code top-level domain), is a two-character abbreviation (Internet Assigned Numbers Authority 2001) indicating the political or administrative entity in question, e.g. "jp" for Japan, or "bt" for Bhutan.

By using the ccTDLs, one is able to gather information about the numbers of hyperlinks created in various Asian countries, and to determine their basic geographical relationships. Naturally, finer-grained and more precise studies, ones which take into account the electronic addresses of various national and institutional sub-networks, are also feasible, but these will not be undertaken in this paper.
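To illustrate how the two-character country code can be read off an internet address, the sketch below uses Python's standard urllib.parse module. It is not the procedure used in this study, which relied on search engine queries rather than on parsing individual addresses; the function name and the sample address are hypothetical.

```python
from urllib.parse import urlsplit


def country_code(url):
    """Return the two-letter final label of the host name, or None if absent."""
    host = urlsplit(url).hostname  # None when the address carries no host part
    if not host:
        return None
    last_label = host.rstrip(".").rsplit(".", 1)[-1].lower()
    # Only two-letter alphabetic labels are treated as country codes here;
    # generic domains such as "com" or "org" yield None.
    if len(last_label) == 2 and last_label.isalpha():
        return last_label
    return None


print(country_code("http://www.example.co.jp/docs/index.html"))  # jp
```

Note that addresses registered under generic domains carry no country information at all, which is one reason why a ccTDL-based survey can only approximate the geography of cyberspace.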

The second major group of terms used in this paper are geographical labels. As I said earlier, the basic units of our data-gathering operations are the ccTDL codes which succinctly identify the world's countries and territories. Because of the exploratory nature of this investigation our taxonomy of those codes will be simplistic. We shall look here at the cyberspaces of fifty Asian political entities. These will be grouped into five major regions:

Clearly, the above taxonomy is crude. It arbitrarily assigns fixed geographical locations to many fluid and ambiguous cases. For instance, Israel, Turkey, Georgia, and Armenia are defined in this investigation as parts of Asia, whereas Russia's Siberia - a territory without a separate ccTDL - is not. Similarly, two Indian Ocean islands, Christmas Island (cx) and Cocos (Keeling) Islands (cc), are classified here as a part of Australia and, consequently, are excluded from this study. On the other hand, the UK-governed British Indian Ocean Territory (Chagos Archipelago) (io) is treated in this paper as an integral part of South Asia.

Finally, a minor but important point. This paper freely uses colloquial variants of official geographical terminology. For example, "Burma" is used here instead of Myanmar, which is its current official name. Likewise, "North Korea" is used to denote the Democratic People's Republic of Korea, "Syria" is used instead of Syrian Arab Republic, and "Taiwan" instead of Taiwan, Republic of China (ROC) or, alternatively: Taiwan, Province of China. Needless to say, none of these cases indicates the author's disrespect for etiquette and political circumstances.

3. Methodology

The sample size
This paper is based on several large-scale sets of numeric data with information concerning approximately 14.6 mln core Asian links. In addition to statistics about core hyperlinks, numeric data were also collected for approximately 352 million outbound Asian hyperlinks. However, the accuracy of that additional data set cannot, at present, be adequately determined (see section below on "data correction"). Very likely, most (if not all) of the statistics pertaining to the outbound Asian links are systematically inflated by some, at this stage, unknown percentage. Therefore, only the Asia/Asia, i.e. the core, materials will be discussed in this paper.

"Weblinksurvey" technique
The hyperlink statistics were collected during several automated data-collecting runs with the use of "weblinksurvey" software (written in Perl). The software was written to interrogate the AltaVista (www.altavista.com) search engine. The underlying method, based on structured queries regarding numbers of hyperlinks leading (and not leading) to a given destination, was first proposed by Rodriguez & Manuel (1997), and then employed in studies by Rousseau (1997), Almind & Ingwersen (1997), and Smith (1999a). This very useful research technique was independently re-invented by this author in mid 2000, and automated in early 2001. The weblinksurvey software offers several advantages. It replaces one-off, repetitive and time-consuming manual data-gathering with a general-purpose, speedy (because computerised) procedure. The weblinksurvey program can make about 1,000 queries an hour to a public access online search engine. This speed is limited only by the speed of the network itself and the responsiveness of the search engine whose data it mines. A fragment of an output file (a specimen extracted from the June 2001 survey) produced by the weblinksurvey program is shown below:

Here we can see that in June 2001 the AltaVista search engine knew about the following Asian links pointing to electronic resources in Taiwan: 264 links from websites in United Arab Emirates, 223 links from sites in Afghanistan, 115 links from Armenia, 333 links from Azerbaijan, etc. A detailed and extensive discussion of the weblinksurvey software and of the overall data-collection and data-cleaning methodology is provided in Ciolek (2001).

The status of the field data

This study uses information extracted from the holdings of AltaVista. In December 2001 that search engine covered a worldwide territory comprising some 550 mln web pages. This feat has established AltaVista as the third largest search engine (SE) in existence (Sullivan 2001), following Google (1.5 bln indexed documents), and Fast (625 mln indexed documents). Unfortunately neither Google nor Fast provides the unique search syntaxes offered by AltaVista and thus neither could be queried by the weblinksurvey software.

Admittedly, mining the contents of search engines has its serious methodological handicaps (Snyder & Rosenbaum 1999). Firstly, it means that our knowledge of cyberspace is always limited by the necessarily incomplete, simplistic and skewed knowledge a search engine has of its massive and intricate electronic environment. As Feldman notes, all search engines "typically index a biased sample of the web" (1999). Inevitably, some of the search engines may be more up-to-date and more comprehensive than others. However, they all tend to overemphasise the presence of the more visible pages, that is, pages with several links to them. This study makes an assumption that while AltaVista may show a strong bias towards recording details of the most linked-to pages on the Net it does not have any systematic bias towards pages dealing with any particular topic, or written in any particular language, or located in any particular part of the world.

In addition, it needs to be remembered that all search engines store messy information. It is so, because the intelligence they keep is collected on a continuous basis. Also, it is served by arrays of tens and hundreds of interlinked processors. Each of those processors has its own memory cache and is quasi-independent of the behaviour and knowledge of its neighbours. This means that information received in response to a query can vary strongly from one instance to another. On the whole, there seem to be three groups of variations in the responses of a given search engine. Firstly, there is the issue of freshness of the data: not all parts of cyberspace are indexed simultaneously. Secondly, the proportions of unique observations to duplicate observations always vary because not all parts of the SE database are updated simultaneously. Finally, there can be discrepancies in replies offered in response to permutations of the same set of questions. In theory, the sum of answers to a sequence of queries A and B should be identical with the sum of answers to a sequence B and A. In practice, these sums are often similar but not identical.

In this study the currency of data is not an issue. However, the other two shortcomings present a challenge. Therefore, it is hoped that the large numbers of observations, namely the 14.6 million data points, collected during a number of separate surveys which were conducted over a period of many months will help to neutralise any random variations in the data.

Data collection schedules
This study is based on several consecutive data collection runs dealing with both core and outgoing Asian hyperlinks. However, as I mentioned earlier, only the core links will be discussed here. Our continent-wide surveys of web links took place at roughly one-month intervals (apart from a longer gap between July and October 2001), i.e. on (1) 8 Jun 2001; (2) 17 Jul 2001; (3) 30 Oct 2001; (4) 26 Nov 2001; (5) 18 Dec 2001; and (6) 17 Jan 2002.

Not all surveys were exactly identical. Survey no. 1 unfortunately failed to collect statistics on links dealing with the British Indian Ocean Territory (io). This problem was rectified in the five subsequent surveys. Also, due to a human error, the original data from survey no. 3 which dealt with the core Asian hyperlinks were accidentally overwritten and lost. Therefore, the final study is based on five, instead of six, surveys. There is also another difference in our data runs. The first five surveys queried AltaVista with the aid of a formula "link:[ccTDL] -host:[ccTDL]." This expression means "give me a count (and a list) of all links pointing to a country designated by the ccTDL. Exclude from the statistics information about those links which originate on internet hosts in that country." However, in survey number six a related but slightly different expression was used: "link:[ccTDL] -domain:[ccTDL]." That change was small but effective. It eliminated the spurious references to those host, sub-network and network addresses which replicated the two-character ccTDL code.
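The two query formulas quoted above differ only in the operator used for the exclusion. The following Python sketch merely assembles the query text for a given country code; it does not contact any search engine, and the function and parameter names are hypothetical and do not reproduce the (Perl) weblinksurvey code.

```python
def survey_query(cc, exclude_operator="host"):
    """Build the query text 'link:<cc> -host:<cc>' or 'link:<cc> -domain:<cc>'."""
    if exclude_operator not in ("host", "domain"):
        raise ValueError("exclude_operator must be 'host' or 'domain'")
    return f"link:{cc} -{exclude_operator}:{cc}"


print(survey_query("tw"))            # formula used in the first five surveys
print(survey_query("tw", "domain"))  # formula used in survey number six
```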

Raw data storage
The complete original output of each survey was stored in plain ASCII format. The full set was published as five documents, each annotated with Dublin Core metadata, on a public access web site (Ciolek 2002). The five data sets dealing with Asia-Asia hyperlinks together represent 250 KB of numeric data. They are freely available for public inspection, as well as for research use.

Data standardisation
In order to offset seasonal or random variations present in the contents of the AltaVista search engine this paper combines data from individual surveys of links between pairs of countries in the form of an adjusted median value. This standardisation of measurements was done in two separate steps. Firstly, in each survey series the highest value in each row of data was systematically excluded from the count. This adjustment was intended to narrow down the difference between the lowest and highest number of links counted for a given pair of countries. For example, in the case of links leading from United Arab Emirates to Indonesia: 555 (June), 285 (July), 473,920 (November), 744 (December), 2,711,778 (January), it was the highest (i.e. January) value that was excluded from the sample. As the second step, each group of individual values was summarised as a median, or the value which occupies the mid-point within a set of measurements. For example, the median for the above quoted sequence of counts of links from United Arab Emirates to Indonesia (with the January value being excluded) was equal to 650, i.e. the rounded mid-point between 555 and 744.
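The two standardisation steps can be retraced with a few lines of Python. This is a minimal sketch using the standard statistics module; the function name is an assumption, and the figures are those quoted above for links from United Arab Emirates to Indonesia.

```python
from statistics import median


def adjusted_median(counts):
    """Drop the single highest count, then return the median of the rest."""
    if len(counts) < 2:
        raise ValueError("at least two survey counts are needed")
    trimmed = sorted(counts)[:-1]  # step 1: exclude the highest value
    return median(trimmed)         # step 2: take the mid-point of the remainder


# June, July, November, December 2001 and January 2002 counts quoted in the text.
uae_to_indonesia = [555, 285, 473_920, 744, 2_711_778]
print(adjusted_median(uae_to_indonesia))  # 649.5, reported in the text as 650
```

With four values remaining, the median is the mean of the two middle ones (555 and 744), which explains why the reported figure is not itself one of the observed counts.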

Data correction
In addition to being standardised, all measurements reported in this paper were also corrected. This is because AltaVista's "link:ccTDL -host:ccTDL" command generates a steady percentage of spurious results (Ciolek 2001:Appendix 4). The above command tends to over-estimate the number of links coming from or leading to a given country. The exact magnitude of the error varies from country to country. An investigation carried out in December 2001 suggests that the following conversion ratios (expressed as percentages) apply to Asian ccTDLs: jp = 100; kz = 98; cn = 96; pk = 88; tr = 88; kg = 86; kh = 86; th = 76; lb = 68; id = 66; il = 64; kr = 64; ph = 58; tw = 58; uz = 58; ye = 56; bn = 56; jo = 54; tj = 52; bh = 50; vn = 48; sg = 48; hk = 44; lk = 42; am = 36; my = 32; kw = 30; qa = 26; ir = 20; ae = 20; np = 16; ge = 12; in = 10; az = 10; mv = 6; om = 6; tp = 4; bt = 4; tm = 2; sa = 2; mm = 2; mn = 0.5; mo = 0.5; bd = 0.5; sy = 0.5; la = 0.5; io = 0.5; iq = 0.5; kp = 0.5; af = 0.5. This means, for example, that when dealing with statistics involving the code "jp" 100% of measurements will pertain to Japan's cyberspace. However, in the case of the code "vn" only 48% of counted links deal with Vietnam's cyberspace. Finally, in the case of the code "af" only half a percent of links are likely to stem from (or lead to) Afghanistan's online resources.

Accordingly, all pairs of median values which were collected via the "link:ccTDL -host:ccTDL" command had to be recalculated to reflect both the reduced "footprint" of the destinations (systematically overestimated by the "link:ccTDL" expression) and the reduced "footprint" of the sources (systematically overestimated by the "host:ccTDL" expression). For example, in a hypothetical case of 400 hyperlinks established between two countries, say, China and Japan:

From/To    CN     JP
CN         100    100
JP         100    100

the corrected values, for data collected through "link:ccTDL -host:ccTDL" command, read:

From/To    CN                     JP
CN         92 (=100*0.96*0.96)    96 (=100*0.96*1.0)
JP         96 (=100*1.0*0.96)     100 (=100*1.0*1.0)

However, in cases where the more accurate command, namely "link:ccTDL -domain:ccTDL" had been used, only the "footprints" of destinations needed to be adjusted:

From/To    CN                JP
CN         96 (=100*0.96)    100 (=100*1.0)
JP         96 (=100*0.96)    100 (=100*1.0)
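The two correction rules can be restated compactly in code. The sketch below is an illustration only: the function and variable names are assumptions, and only the two conversion ratios needed for the hypothetical China/Japan example (cn = 96%, jp = 100%) are included.

```python
# Conversion ratios from the December 2001 investigation, as fractions.
RATIO = {"cn": 0.96, "jp": 1.00}


def corrected_count(count, source, target, command="host"):
    """Scale a raw link count by the 'footprint' ratios of the countries involved.

    command="host":   both source and destination footprints are reduced.
    command="domain": only the destination footprint is reduced.
    """
    factor = RATIO[target]
    if command == "host":
        factor *= RATIO[source]
    elif command != "domain":
        raise ValueError("command must be 'host' or 'domain'")
    return round(count * factor)


for command in ("host", "domain"):
    for source in ("cn", "jp"):
        row = [corrected_count(100, source, target, command) for target in ("cn", "jp")]
        print(command, source, row)
```

Run on 100 raw links per cell, the loop reproduces the corrected values shown in the two tables above: 92, 96, 96 and 100 for the "host" formula, and 96, 100, 96 and 100 for the "domain" formula.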

The above data cleaning has been comprehensively applied to all statistics dealing with the core Asian hyperlinks. These results are presented below in the form of six tables.

4. Distribution of hypertext connections in Asia: numeric data

A. The spatial organisation of core Asian cyberspace: degrees of popularity and introversion

Table 1: Incoming and outgoing hyperlinks in core Asian cyberspace*
Source of links    Self-directed links**    Outgoing links**    Incoming links    Incoming as % of outgoing
United Arab Emirates 1,092 13,886 3,688 27%
Afghanistan 2 3,405 173 5%
Armenia 4,374 1,085 14,679 1,353%
Azerbaijan 1,190 1,272 1,207 95%
Bangladesh 0 22 247 1,123%
Bahrain 703 217,753 2,854 1%
Brunei 3,493 142,626 24,154 17%
Bhutan 7 7,561 610 8%
China 658,133 63,163 74,162 117%
Georgia 390 551 2,708 491%
Hong Kong 102,505 278,896 31,498 11%
Indonesia 74,739 7,944 3,130,164 39,403%
Israel 246,949 59,297 18,908 32%
India 5,927 166,575 35,412 21%
British Indian Ocean Territory 2 939 214 23%
Iraq 0 54 22 41%
Iran 573 40,011 11,807 30%
Jordan 1,558 75,836 2,991 4%
Japan 8,267,830 640,920 209,834 33%
Kyrgyzstan 9,267 221,503 6,547 3%
Cambodia 22 1,515 1,109 73%
North Korea 0 28 76 271%
South Korea 719,933 125,505 30,877 25%
Kuwait 778 4,589 10,621 231%
Kazakhstan 8,911 43,001 9,229 21%
Laos 0 680 70 10%
Lebanon 6,922 15,234 7,101 47%
Sri Lanka 3,078 327,334 10,123 3%
Burma 4 49 866 1,767%
Mongolia 4 268 101 38%
Macau 9 1,603 149 9%
Maldives 46 3,690 539 15%
Malaysia 28,606 143,330 69,145 48%
Nepal 660 960 7,474 779%
Oman 8 17 458 2,694%
Philippines 26,403 594,386 18,006 3%
Pakistan 24,951 2,021 39,848 1,972%
Qatar 82 112,674 19,575 17%
Saudi Arabia 44 1,363 1,273 93%
Singapore 88,171 59,731 30,152 50%
Syria 0 3,460 70 2%
Thailand 85,554 22,283 37,642 169%
Tajikistan 2,558 66,969 5,171 8%
Turkmenistan 11 60 794 1,323%
East Timor 11 50 1,169 2,338%
Turkey 156,195 19,865 29,100 146%
Taiwan 96,170 38,805 33,434 86%
Uzbekistan 1,238 50,381 2,402 5%
Vietnam 5,071 3,157 2,335 74%
Yemen 372 357,933 3,452 1%
Total links within Asia 10,634,546 3,944,240*** 3,944,240*** 14,578,786

* Comprises only those links which originate and terminate in Asia.
** Links which originate in the same source country.
*** The sums of outgoing and incoming links which originate and terminate in the study area are, by definition, identical.
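The last column of Table 1 and its grand total can be re-derived from the other columns. The following Python sketch does so for three sample rows; the function name is an assumption, and the figures are copied from Table 1.

```python
def incoming_as_pct_of_outgoing(incoming, outgoing):
    """Return incoming links as a whole-number percentage of outgoing links."""
    return round(100 * incoming / outgoing)


# (incoming, outgoing) pairs copied from Table 1.
sample_rows = {
    "Indonesia": (3_130_164, 7_944),
    "Japan": (209_834, 640_920),
    "Yemen": (3_452, 357_933),
}
for country, (incoming, outgoing) in sample_rows.items():
    print(country, incoming_as_pct_of_outgoing(incoming, outgoing))

# Grand total: self-directed links plus links between different Asian countries.
total_core_links = 10_634_546 + 3_944_240
print(total_core_links)
```

The grand total obtained in this way, 14,578,786, is the source of the "approximately 14.6 mln" figure used throughout the paper.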

Findings:

A.1. The studied countries differ very strongly with regard to the online roles they play. All countries are found to receive and send hyperlinks. However, different strategies can be seen to emerge. Some countries are heavy attractors of incoming links, whereas others are heavy producers of outgoing links. In addition, some countries may choose to be net exporters of online information. In such a case they have far more incoming links than outgoing ones. Countries may also adopt a reverse strategy, and have decisively more outgoing links than incoming ones. In that case they inevitably play the role of online "watchers."

A.2. In terms of absolute numbers, the five most strongly linked-to countries are: Indonesia, Japan, China, Malaysia, Pakistan. These five countries attract between themselves 89% of all incoming links in core Asian cyberspace.

A.3. In relative terms, the five countries with the greatest proportion of incoming links to outgoing ones are: Indonesia, Oman, East Timor, Pakistan, and Burma. From a structural point of view, such countries appear to be archetypal information "exporters."

A.4. In terms of absolute numbers, the five least strongly linked-to countries are: Iraq, Syria, Laos, North Korea, and Mongolia. Together these five countries attract only a minuscule fraction of 1% of all incoming links within core Asian cyberspace.

A.5. In relative terms, the five countries with the lowest proportion of incoming links to outgoing ones are: Yemen, Bahrain, Syria, Philippines, Sri Lanka. From a structural point of view, such countries are exemplary information "consumers."
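The incoming-to-outgoing percentages used in A.3 and A.5 can be recomputed directly from the link counts. The Python sketch below is illustrative only: the counts are copied from Table 1 for five countries, while the function and variable names are assumptions of this example and not part of the original survey tooling.

```python
# Outgoing and incoming (non-self) links for a few countries, copied from Table 1.
LINKS = {
    "Japan": (640_920, 209_834),
    "Burma": (49, 866),
    "Oman": (17, 458),
    "Philippines": (594_386, 18_006),
    "Yemen": (357_933, 3_452),
}


def incoming_to_outgoing_pct(outgoing: int, incoming: int) -> float:
    """Return incoming links as a percentage of outgoing links."""
    if outgoing == 0:
        raise ValueError("ratio is undefined for a country with no outgoing links")
    return 100.0 * incoming / outgoing


ratios = {
    name: incoming_to_outgoing_pct(out, inc) for name, (out, inc) in LINKS.items()
}

# Highest ratios first: structural "exporters" lead, "consumers" trail.
ranked = sorted(ratios, key=ratios.get, reverse=True)

for name in ranked:
    print(f"{name:12s} {ratios[name]:8.0f}%")
```

A ratio far above 100% marks a country that receives many more core Asian links than it sends; a ratio near zero marks the opposite.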

A.6. In terms of absolute numbers of incoming links, Asian countries form five major classes: one case of a country with over 1,000,000 incoming links: Indonesia; one case of a country with 100,000+ incoming links: Japan; 18 cases of countries with 10,000+ incoming links: China, Malaysia, Pakistan, Thailand, India, Taiwan, Hong Kong, South Korea, Singapore, Turkey, Brunei, Qatar, Israel, Philippines, Armenia, Iran, Kuwait, Sri Lanka; 16 cases of countries with 1,000+ incoming links: Kazakhstan, Nepal, Lebanon, Kyrgyzstan, Tajikistan, United Arab Emirates, Yemen, Jordan, Bahrain, Georgia, Uzbekistan, Vietnam, Saudi Arabia, Azerbaijan, East Timor, Cambodia; 14 cases of countries with fewer than 1,000 incoming links: Burma, Turkmenistan, Bhutan, Maldives, Oman, Bangladesh, British Indian Ocean Territory, Afghanistan, Macau, Mongolia, North Korea, Syria, Laos, and Iraq.

Countries of Asian cyberspace can also be seen to vary strongly in terms of destinations of their outgoing links.

Table 2: Major destinations of hyperlinks in core Asian cyberspace
Source of links | Destination: the country itself* | Destination: other countries in the region | Destination: the rest of Asia | Total % | Total links | % of Asian cyberspace
United Arab Emirates 7.3% 0.5% 92.2% 100.0% 14,978 0.1%
Afghanistan 0.1% 0.1% 99.8% 100.0% 3,407 0.0%
Armenia 80.1% 2.5% 17.4% 100.0% 5,459 0.0%
Azerbaijan 48.3% 1.9% 49.8% 100.0% 2,462 0.0%
Bangladesh 0.0% 0.0% 100.0% 100.0% 22 0.0%
Bahrain 0.3% 0.0% 99.7% 100.0% 218,456 1.5%
Brunei 2.4% 97.4% 0.2% 100.0% 146,119 1.0%
Bhutan 0.1% 0.1% 99.8% 100.0% 7,568 0.1%
China 91.2% 2.7% 6.0% 100.0% 721,296 4.9%
Georgia 41.4% 5.4% 53.1% 100.0% 941 0.0%
Hong Kong 26.9% 6.1% 67.0% 100.0% 381,401 2.6%
Indonesia 90.4% 4.5% 5.1% 100.0% 82,683 0.6%
Israel 80.6% 0.9% 18.4% 100.0% 306,246 2.1%
India 3.4% 0.1% 96.4% 100.0% 172,502 1.2%
British Indian Ocean Territory 0.2% 0.1% 99.7% 100.0% 941 0.0%
Iraq 0.0% 0.0% 100.0% 100.0% 54 0.0%
Iran 1.4% 0.1% 98.5% 100.0% 40,584 0.3%
Jordan 2.0% 0.1% 97.9% 100.0% 77,394 0.5%
Japan 92.8% 1.0% 6.2% 100.0% 8,908,750 61.1%
Kyrgyzstan 4.0% 0.1% 95.9% 100.0% 230,770 1.6%
Cambodia 1.4% 94.1% 4.4% 100.0% 1,537 0.0%
North Korea 0.0% 100.0% 0.0% 100.0% 28 0.0%
South Korea 85.2% 3.7% 11.2% 100.0% 845,438 5.8%
Kuwait 14.5% 0.5% 85.0% 100.0% 5,367 0.0%
Kazakhstan 17.2% 0.1% 82.7% 100.0% 51,912 0.4%
Laos 0.0% 97.4% 2.6% 100.0% 680 0.0%
Lebanon 31.2% 0.5% 68.3% 100.0% 22,156 0.2%
Sri Lanka 0.9% 0.0% 99.0% 100.0% 330,412 2.3%
Burma 7.5% 18.9% 73.6% 100.0% 53 0.0%
Mongolia 1.5% 84.6% 14.0% 100.0% 272 0.0%
Macau 0.6% 36.5% 62.9% 100.0% 1,612 0.0%
Maldives 1.2% 0.2% 98.6% 100.0% 3,736 0.0%
Malaysia 16.6% 73.6% 9.8% 100.0% 171,936 1.2%
Nepal 40.7% 2.7% 56.5% 100.0% 1,620 0.0%
Oman 32.0% 20.0% 48.0% 100.0% 25 0.0%
Philippines 4.3% 91.2% 4.6% 100.0% 620,789 4.3%
Pakistan 92.5% 0.5% 7.0% 100.0% 26,972 0.2%
Qatar 0.1% 0.0% 99.9% 100.0% 112,756 0.8%
Saudi Arabia 3.1% 1.3% 95.6% 100.0% 1,407 0.0%
Singapore 59.6% 22.6% 17.7% 100.0% 147,902 1.0%
Syria 0.0% 0.0% 100.0% 100.0% 3,460 0.0%
Thailand 79.3% 13.1% 7.6% 100.0% 107,837 0.7%
Tajikistan 3.7% 0.0% 96.3% 100.0% 69,527 0.5%
Turkmenistan 15.5% 0.0% 84.5% 100.0% 71 0.0%
East Timor 18.0% 37.7% 44.3% 100.0% 61 0.0%
Turkey 88.7% 0.9% 10.4% 100.0% 176,060 1.2%
Taiwan 71.3% 23.4% 5.3% 100.0% 134,975 0.9%
Uzbekistan 2.4% 0.1% 97.5% 100.0% 51,619 0.4%
Vietnam 61.6% 21.1% 17.3% 100.0% 8,228 0.1%
Yemen 0.1% 0.0% 99.9% 100.0% 358,305 2.5%
Total % 72.9% 7.5% 19.6% 100.0% 14,578,786 100.0%
Total links within Asia 10,634,546 1,088,060 2,856,180 14,578,786

* i.e. self-directed links.

While Internet resources exist in all 50 studied countries and territories, there are profound differences in terms of places of origin of the majority of core Asian hyperlinks, as well as places to which these links lead.

Findings: sources of links

A.6. Numerically speaking, the smallest number of core Asian links was found to be generated in Bangladesh and Oman (22 and 25 links respectively), and the largest number of them was produced in Japan (8.9 mln). The median number of core Asian links/country is 24,564.

A.7. The eight most productive countries/territories are (in descending order of their hyperlink productivity): Japan (61.1% of Asia's links), South Korea (5.8%), China (4.9%), the Philippines (4.3%), Hong Kong (2.6%), Yemen (2.5%), Sri Lanka (2.3%) and Israel (2.1%). Together they produce over 12.4 mln links, or 85% of all core links in Asia.
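The median and the combined output of the eight largest producers can be checked against the "Total links" column of Table 2. The following Python sketch is only a verification aid: the figures are transcribed from that column, and the names `TOTAL_LINKS`, `median_links` and `top_eight` are assumptions of this example.

```python
import statistics

# Core Asian links produced per country (self-directed + outgoing),
# transcribed from the "Total links" column of Table 2.
TOTAL_LINKS = {
    "United Arab Emirates": 14_978, "Afghanistan": 3_407, "Armenia": 5_459,
    "Azerbaijan": 2_462, "Bangladesh": 22, "Bahrain": 218_456,
    "Brunei": 146_119, "Bhutan": 7_568, "China": 721_296, "Georgia": 941,
    "Hong Kong": 381_401, "Indonesia": 82_683, "Israel": 306_246,
    "India": 172_502, "British Indian Ocean Territory": 941, "Iraq": 54,
    "Iran": 40_584, "Jordan": 77_394, "Japan": 8_908_750,
    "Kyrgyzstan": 230_770, "Cambodia": 1_537, "North Korea": 28,
    "South Korea": 845_438, "Kuwait": 5_367, "Kazakhstan": 51_912,
    "Laos": 680, "Lebanon": 22_156, "Sri Lanka": 330_412, "Burma": 53,
    "Mongolia": 272, "Macau": 1_612, "Maldives": 3_736, "Malaysia": 171_936,
    "Nepal": 1_620, "Oman": 25, "Philippines": 620_789, "Pakistan": 26_972,
    "Qatar": 112_756, "Saudi Arabia": 1_407, "Singapore": 147_902,
    "Syria": 3_460, "Thailand": 107_837, "Tajikistan": 69_527,
    "Turkmenistan": 71, "East Timor": 61, "Turkey": 176_060,
    "Taiwan": 134_975, "Uzbekistan": 51_619, "Vietnam": 8_228,
    "Yemen": 358_305,
}

grand_total = sum(TOTAL_LINKS.values())

# With 50 countries the median is the mean of the 25th and 26th values.
median_links = statistics.median(TOTAL_LINKS.values())

# The eight largest producers and their combined share of all core links.
top_eight = sorted(TOTAL_LINKS, key=TOTAL_LINKS.get, reverse=True)[:8]
top_eight_links = sum(TOTAL_LINKS[name] for name in top_eight)
top_eight_share = 100.0 * top_eight_links / grand_total

print(f"grand total: {grand_total:,}")
print(f"median:      {median_links:,.0f}")
print(f"top eight:   {top_eight_links:,} links ({top_eight_share:.1f}%)")
```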

Findings: destinations of links

From a purely logical point of view there are three concentric groups of targets for hyperlinks originating in a given country. These are: (a) their own domestic resources; (b) foreign resources situated in the same geographic region as a given country; (c) resources situated in some other geographic region of the continent. Analysis of the data shows that the way these opportunities are exploited differs when studied at the level of the entire continent and at the level of individual countries.

A.8. At the level of the continent as a whole, almost three quarters (72.9%) of all core links in Asia are directed towards online resources situated within the cyberspace of the country from which hyperlinks have originated. In addition, about one link for every thirteen (i.e. 7.5%) is directed towards resources in the larger geographical region to which a given country belongs. Finally, about one in every five links (19.6%) is directed towards resources established in other parts of Asia.

A.9. The five most "self-orientated" countries are: Japan - 92.8% of core Asian hyperlinks from that country lead to resources located in the country itself; Pakistan - 92.5%; China - 91.2%; Indonesia - 90.4%; Turkey - 88.7%.

A.10. The five most "region-orientated" countries are: North Korea - 100.0% of core Asian hyperlinks from that country lead to resources located in the immediate geographical region; Brunei - 97.4%; Laos - 97.4%; Cambodia - 94.1%; Philippines - 91.2%.

A.11. The ten most "rest-of-Asia-orientated" countries are: Bangladesh - 100.0% of core Asian hyperlinks from that country lead to resources located in other parts of Asia; Iraq - 100.0%; Syria - 100.0%; Qatar - 99.9%; Yemen - 99.9%; Afghanistan - 99.8%; Bhutan - 99.8%; British Indian Ocean Territory - 99.7%; Bahrain - 99.7%; Sri Lanka - 99.0%.

A.12. In terms of absolute numbers of outgoing links, countries of Asia form four major classes: 12 cases of countries with over 100,000 outgoing links: Japan, Philippines, Yemen, Sri Lanka, Hong Kong, Kyrgyzstan, Bahrain, India, Malaysia, Brunei, South Korea, Qatar; 13 cases of countries with over 10,000 outgoing links: Jordan, Tajikistan, China, Singapore, Israel, Uzbekistan, Kazakhstan, Iran, Taiwan, Thailand, Turkey, Lebanon, United Arab Emirates; 13 cases of countries with over 1,000 outgoing links: Indonesia, Bhutan, Kuwait, Maldives, Syria, Afghanistan, Vietnam, Pakistan, Macau, Cambodia, Saudi Arabia, Azerbaijan, Armenia; 12 cases of countries with fewer than 1,000 outgoing links: Nepal, British Indian Ocean Territory, Laos, Georgia, Mongolia, Turkmenistan, Iraq, East Timor, Burma, North Korea, Bangladesh, and Oman.

B. The spatial organisation of core-Asian cyberspace: links to individual Asian countries

Findings: destinations of links

Data relating to destinations of Asian hyperlinks, when analysed at the country-to-country level, show that Asian countries form two distinct groups.

B.1. The first group comprises those countries where over 50% of outgoing web links are directed to a single Asian destination. To this group belong 38 countries/territories (in ascending order of the share of links sent to that destination): Armenia; Nepal; Thailand; Burma; Pakistan; Taiwan; Macau; British Indian Ocean Territory; Bhutan; Iran; Israel; Kyrgyzstan; Mongolia; Syria; Hong Kong; Bangladesh; Malaysia; Lebanon; Kazakhstan; Yemen; United Arab Emirates; Saudi Arabia; Philippines; Cambodia; Maldives; Iraq; Laos; India; Kuwait; Tajikistan; Brunei; Afghanistan; Jordan; Qatar; Bahrain; Uzbekistan; Sri Lanka; North Korea.

B.2. The second group is formed by those countries where outgoing links are spread over more than one Asian destination. In those cases the most frequently linked-to Asian country does not receive more than 50% of all Asia-directed links. To this group belong twelve countries (in alphabetic order): Azerbaijan; China; East Timor; Georgia; Indonesia; Japan; Oman; Singapore; South Korea; Turkey; Turkmenistan; Vietnam.

B.3. Among those countries studied, the range of preferences for links to Asian resources is extremely limited. Essentially, there are only two possible destinations - Indonesia and Japan. These two countries act as the most often linked-to electronic "ports of call" for almost all countries of Asia (see Tables 3 and 4).

Table 3: Countries as destinations of hyperlinks in core Asian cyberspace: detailed view
Source of links | Links to the most linked-to destination* | Most linked-to destination | % of all outgoing links*
United Arab Emirates 13,043 Japan 93.9%
Afghanistan 3,364 Indonesia 98.8%
Armenia 589 Indonesia 54.3%
Azerbaijan 575 Indonesia 45.2%
Bangladesh 19 Japan 86.4%
Bahrain 217,346 Indonesia 99.8%
Brunei 140,467 Indonesia 98.5%
Bhutan 5,047 Indonesia 66.8%
China 25,974 Indonesia 41.1%
Georgia 253 Japan 45.9%
Hong Kong 240,406 Indonesia 86.2%
Indonesia 2,701 Japan 34.0%
Israel 46,561 Indonesia 78.5%
India 161,929 Indonesia 97.2%
British Indian Ocean Territory 624 Japan 66.5%
Iraq 52 Malaysia 96.3%
Iran 27,190 Indonesia 68.0%
Jordan 75,076 Indonesia 99.0%
Japan 278,862 Indonesia 43.5%
Kyrgyzstan 174,339 Indonesia 78.7%
Cambodia 1,444 Malaysia 95.3%
North Korea 28 Japan 100%
South Korea 49,921 Indonesia 39.8%
Kuwait 4,502 Indonesia 98.1%
Kazakhstan 39,712 Indonesia 92.4%
Laos 658 Indonesia 96.8%
Lebanon 13,932 Japan 91.5%
Sri Lanka 326,928 Indonesia 99.9%
Burma 28 Japan 57.1%
Mongolia 224 Japan 83.6%
Macau 1,001 Indonesia 62.4%
Maldives 3,553 Indonesia 96.3%
Malaysia 124,060 Indonesia 86.6%
Nepal 529 Indonesia 55.1%
Oman 4 India 23.5%
Philippines 565,055 Indonesia 95.1%
Pakistan 1,163 Indonesia 57.5%
Qatar 112,416 Indonesia 99.8%
Saudi Arabia 1,293 Japan 94.9%
Singapore 22,143 Indonesia 37.1%
Syria 2,943 Indonesia 85.1%
Thailand 12,652 Indonesia 56.8%
Tajikistan 65,955 Indonesia 98.5%
Turkmenistan 23 Japan 38.3%
East Timor 21 Japan 42.0%
Turkey 9,150 Indonesia 46.1%
Taiwan 23,391 Japan 60.3%
Uzbekistan 50,307 Indonesia 99.9%
Vietnam 1,496 Indonesia 47.4%
Yemen 334,122 Indonesia 93.3%
Total links to most popular countries 3,183,071 80.7%

* Excluding self-directed links

Table 4: Countries as destination of hyperlinks in core Asian cyberspace: a summary
Most linked-to destination | No of received links | % of received links | No of countries with most numerous links to the destination
Indonesia 3,125,991 98% 34 (68%)
Japan 55,580 2% 13 (26%)
Malaysia 1,496 0% 2 (4%)
India 4 0% 1 (2%)
Total 3,183,071 100% 50 (100%)
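The summary in Table 4 is an aggregation of Table 3 by most linked-to destination. As a minimal sketch of that step, the Python fragment below groups the sixteen Table 3 rows whose top destination is not Indonesia; the row data are copied from Table 3, while the variable names and the use of `collections.Counter` are assumptions of this example.

```python
from collections import Counter

# (source country, most linked-to destination, links sent there), from Table 3.
# Only the rows whose top destination is NOT Indonesia are listed.
ROWS = [
    ("United Arab Emirates", "Japan", 13_043),
    ("Bangladesh", "Japan", 19),
    ("Georgia", "Japan", 253),
    ("Indonesia", "Japan", 2_701),
    ("British Indian Ocean Territory", "Japan", 624),
    ("Iraq", "Malaysia", 52),
    ("Cambodia", "Malaysia", 1_444),
    ("North Korea", "Japan", 28),
    ("Lebanon", "Japan", 13_932),
    ("Burma", "Japan", 28),
    ("Mongolia", "Japan", 224),
    ("Oman", "India", 4),
    ("Saudi Arabia", "Japan", 1_293),
    ("Turkmenistan", "Japan", 23),
    ("East Timor", "Japan", 21),
    ("Taiwan", "Japan", 23_391),
]
TOTAL_COUNTRIES = 50  # size of the study area

links_by_destination = Counter()
clients_by_destination = Counter()
for _source, destination, links in ROWS:
    links_by_destination[destination] += links
    clients_by_destination[destination] += 1

# Every country not listed above has Indonesia as its top destination.
indonesia_clients = TOTAL_COUNTRIES - sum(clients_by_destination.values())

for destination, clients in clients_by_destination.most_common():
    share = 100.0 * clients / TOTAL_COUNTRIES
    print(f"{destination:10s} {links_by_destination[destination]:7,d} links, "
          f"{clients} client countries ({share:.0f}%)")
print(f"Indonesia  {indonesia_clients} client countries")
```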

B.4. The most popular cyber-destination among the core Asian links is Indonesia. That country has 34 cyberspace "client" countries. These are (in alphabetic order): Afghanistan; Armenia; Azerbaijan; Bahrain; Bhutan; Brunei; China; Hong Kong; India; Iran; Israel; Japan; Jordan; Kazakhstan; Kuwait; Kyrgyzstan; Laos; Macau; Malaysia; Maldives; Nepal; Pakistan; Philippines; Qatar; Singapore; South Korea; Sri Lanka; Syria; Tajikistan; Thailand; Turkey; Uzbekistan; Vietnam; Yemen.

B.6. The second most popular cyber-destination among the core Asian links is Japan. Japan has 13 cyberspace "client" countries. These are (in alphabetic order): Bangladesh; British Indian Ocean Territory; Burma; East Timor; Georgia; Indonesia; Lebanon; Mongolia; North Korea; Saudi Arabia; Taiwan; Turkmenistan; United Arab Emirates.

C. The spatial organisation of core-Asian cyberspace: links to geographical regions

Further patterns emerge when data for individual destinations in Asia are aggregated into five groups: one cluster of countries for each of the major geographic regions.

Table 5: Geographic regions as destinations of hyperlinks in core Asian cyberspace*
Source of links | Destination: East Asia | West Asia | South-East Asia | South Asia | Central Asia | Total
United Arab Emirates 13,084 1,173 694 24 3 14,978
Afghanistan 16 14 3,370 7 0 3,407
Armenia 227 4,508 688 21 15 5,459
Azerbaijan 134 1,237 1,036 51 4 2,462
Bangladesh 19 1 2 0 0 22
Bahrain 366 728 217,359 3 0 218,456
Brunei 216 37 145,855 8 3 146,119
Bhutan 2,501 2 5,048 17 0 7,568
China 677,915 3,158 35,921 1,249 3,053 721,296
Georgia 300 441 192 5 3 941
Hong Kong 125,888 2,367 251,615 1,415 116 381,401
Indonesia 3,435 510 78,453 266 19 82,683
Israel 4,946 249,832 49,758 1,554 156 306,246
India 1,519 758 164,038 6,168 19 172,502
British Indian Ocean Territory 624 1 313 3 0 941
Iraq 2 0 52 0 0 54
Iran 12,725 610 27,226 21 2 40,584
Jordan 556 1,600 75,189 48 1 77,394
Japan 8,353,222 92,096 382,097 63,115 18,220 8,908,750
Kyrgyzstan 44,913 79 174,434 1,952 9,392 230,770
Cambodia 56 5 1,469 7 0 1,537
North Korea 28 0 0 0 0 28
South Korea 751,074 18,390 70,818 3,690 1,466 845,438
Kuwait 29 803 4,531 3 1 5,367
Kazakhstan 1,554 102 41,191 89 8,976 51,912
Laos 10 5 662 3 0 680
Lebanon 13,998 7,033 1,017 102 6 22,156
Sri Lanka 109 95 327,022 3,185 1 330,412
Burma 33 4 14 2 0 53
Mongolia 234 5 27 6 0 272
Macau 598 8 1,005 1 0 1,612
Maldives 6 119 3,558 53 0 3,736
Malaysia 14,738 1,407 155,071 578 142 171,936
Nepal 147 3 766 704 0 1,620
Oman 0 13 7 5 0 25
Philippines 26,985 924 592,503 360 17 620,789
Pakistan 389 151 1,322 25,089 21 26,972
Qatar 246 84 112,420 6 0 112,756
Saudi Arabia 1,298 62 36 9 2 1,407
Singapore 21,041 1,959 121,650 3,180 72 147,902
Syria 311 0 3,149 0 0 3,460
Thailand 5,939 1,247 99,663 914 74 107,837
Tajikistan 952 13 65,993 11 2,558 69,527
Turkmenistan 31 7 19 3 11 71
East Timor 27 0 34 0 0 61
Turkey 4,904 157,744 12,209 810 393 176,060
Taiwan 127,766 1,846 4,709 550 104 134,975
Uzbekistan 15 12 50,316 1 1,275 51,619
Vietnam 254 158 6,808 1,006 2 8,228
Yemen 9,335 393 335,557 13,019 1 358,305
Total links within Asia 10,224,715 551,744 3,626,886 129,313 46,128 14,578,786
Per cent by destination | East Asia | West Asia | South-East Asia | South Asia | Central Asia | Total
United Arab Emirates 87% 8% 5% 0% 0% 100%
Afghanistan 0% 0% 99% 0% 0% 100%
Armenia 4% 83% 13% 0% 0% 100%
Azerbaijan 5% 50% 42% 2% 0% 100%
Bangladesh 86% 5% 9% 0% 0% 100%
Bahrain 0% 0% 99% 0% 0% 100%
Brunei 0% 0% 100% 0% 0% 100%
Bhutan 33% 0% 67% 0% 0% 100%
China 94% 0% 5% 0% 0% 100%
Georgia 32% 47% 20% 1% 0% 100%
Hong Kong 33% 1% 66% 0% 0% 100%
Indonesia 4% 1% 95% 0% 0% 100%
Israel 2% 82% 16% 1% 0% 100%
India 1% 0% 95% 4% 0% 100%
British Indian Ocean Territory 66% 0% 33% 0% 0% 100%
Iraq 4% 0% 96% 0% 0% 100%
Iran 31% 2% 67% 0% 0% 100%
Jordan 1% 2% 97% 0% 0% 100%
Japan 94% 1% 4% 1% 0% 100%
Kyrgyzstan 19% 0% 76% 1% 4% 100%
Cambodia 4% 0% 96% 0% 0% 100%
North Korea 100% 0% 0% 0% 0% 100%
South Korea 89% 2% 8% 0% 0% 100%
Kuwait 1% 15% 84% 0% 0% 100%
Kazakhstan 3% 0% 79% 0% 17% 100%
Laos 1% 1% 97% 0% 0% 100%
Lebanon 63% 32% 5% 0% 0% 100%
Sri Lanka 0% 0% 99% 1% 0% 100%
Burma 62% 8% 26% 4% 0% 100%
Mongolia 86% 2% 10% 2% 0% 100%
Macau 37% 0% 62% 0% 0% 100%
Maldives 0% 3% 95% 1% 0% 100%
Malaysia 9% 1% 90% 0% 0% 100%
Nepal 9% 0% 47% 43% 0% 100%
Oman 0% 52% 28% 20% 0% 100%
Philippines 4% 0% 95% 0% 0% 100%
Pakistan 1% 1% 5% 93% 0% 100%
Qatar 0% 0% 100% 0% 0% 100%
Saudi Arabia 92% 4% 3% 1% 0% 100%
Singapore 14% 1% 82% 2% 0% 100%
Syria 9% 0% 91% 0% 0% 100%
Thailand 6% 1% 92% 1% 0% 100%
Tajikistan 1% 0% 95% 0% 4% 100%
Turkmenistan 44% 10% 27% 4% 15% 100%
East Timor 44% 0% 56% 0% 0% 100%
Turkey 3% 90% 7% 0% 0% 100%
Taiwan 95% 1% 3% 0% 0% 100%
Uzbekistan 0% 0% 97% 0% 2% 100%
Vietnam 3% 2% 83% 12% 0% 100%
Yemen 3% 0% 94% 4% 0% 100%
Total % 70% 4% 25% 1% 0% 100%

* Regional statistics include, where appropriate, data on countries' self-directed links, e.g. self-links from Pakistan are classified as links from South Asia and pointing to that region.

Findings: sources of links

C.1. 81.7% of links to East Asia originate in Japan, 7.3% of links originate in South Korea, and another 6.6% come from China.

C.2. 45.3% of links to West Asia originate in Israel, 28.6% of links originate in Turkey, and another 16.7% come from Japan.

C.3. 16.3% of links to South-East Asia originate in the Philippines, 10.5% of links originate in Japan, and another 9.3% come from Yemen.

C.4. 48.8% of links to South Asia originate in Japan, 19.4% of links originate in Pakistan, and another 10.1% come from Yemen.

C.5. 39.5% of links to Central Asia originate in Japan, 20% of links originate in Kyrgyzstan, and another 19.5% come from Kazakhstan.

C.6. Over 90% of links received by East Asia and West Asia come from the top three producers.

C.7. Nearly 80% of links received by South Asia and Central Asia come from the top three producers.

C.8. South-East Asia receives links from the widest range of Asian countries. In that region links coming from the top three producers amount only to 36% of the total.
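The concentration figures in C.6 to C.8 are the combined share of the three largest contributors to each regional column of Table 5. A rough Python sketch of that calculation follows; the values are the five largest entries of the East Asia and South-East Asia columns and the column totals from Table 5, and the helper name `top_share` is an assumption of this example.

```python
def top_share(contributions, regional_total, n=3):
    """Return the n largest contributors and their combined share (%) of a region's links."""
    ranked = sorted(contributions.items(), key=lambda item: item[1], reverse=True)
    top = ranked[:n]
    share = 100.0 * sum(links for _name, links in top) / regional_total
    return [name for name, _links in top], share


# Five largest sources of links to each region, copied from Table 5.
TO_EAST_ASIA = {
    "Japan": 8_353_222, "South Korea": 751_074, "China": 677_915,
    "Taiwan": 127_766, "Hong Kong": 125_888,
}
TO_SOUTH_EAST_ASIA = {
    "Philippines": 592_503, "Japan": 382_097, "Yemen": 335_557,
    "Sri Lanka": 327_022, "Hong Kong": 251_615,
}

east_top, east_share = top_share(TO_EAST_ASIA, regional_total=10_224_715)
sea_top, sea_share = top_share(TO_SOUTH_EAST_ASIA, regional_total=3_626_886)

print(east_top, f"{east_share:.1f}%")
print(sea_top, f"{sea_share:.1f}%")
```

The contrast between the two shares is what distinguishes a region fed by a handful of producers from one that receives links from a wide range of countries.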

Findings: destinations of links

Theoretically speaking, countries can freely point their links towards any group of resources in Asia. In practice, however, strong regional preferences emerge.

C.9. Countries with the most intensive electronic interest in East Asia are (in descending order of per cent of their hyperlinks pointed towards the region): North Korea - 100% of North Korean links which are directed towards some Asian destination are addressed to a place in East Asia; Taiwan - 95%; China - 94%; Japan - 94%; Saudi Arabia - 92%; South Korea - 89%; United Arab Emirates - 87%; Bangladesh - 86%; Mongolia - 86%.

C.10. Countries with the most intensive electronic interest in West Asia are (in descending order): Turkey - 90% of Turkish links which are directed towards some Asian destination are addressed to a place in West Asia; Armenia - 83%; Israel - 82%; Oman - 52%; Azerbaijan - 50%; Georgia - 47%; Lebanon - 32%.

C.11. Countries with the most intensive electronic interest in South-East Asia are (in descending order): Brunei - 100% of Brunei's links which are directed towards some Asian destination are addressed to a place in South-East Asia; Qatar - 100%; Afghanistan - 99%; Bahrain - 99%; Sri Lanka - 99%; Jordan - 97%; Laos - 97%; Uzbekistan - 97%; Iraq - 96%; Cambodia - 96%; Maldives - 95%; Indonesia - 95%; Philippines - 95%; India - 95%; Tajikistan - 95%; Yemen - 94%; Thailand - 92%; Syria - 91%; Malaysia - 90%; Kuwait - 84%; Vietnam - 83%; Singapore - 82%; Kazakhstan - 79%; Kyrgyzstan - 76%; Iran - 67%; Bhutan - 67%; Hong Kong - 66%; Macau - 62%; East Timor - 56%.

C.12. Countries with the most intensive electronic interest in South Asia are (in descending order): Pakistan - 93% of Pakistani links which are directed towards some Asian destination are addressed to a place in South Asia; Nepal - 43%; Oman - 20%; Vietnam - 12%.

C.13. Countries with the most intensive electronic interest in Central Asia are (in descending order): Kazakhstan - 17% of Kazakh links which are directed towards some Asian destination are addressed to a place in Central Asia; Turkmenistan - 15%.

C.14. The great majority (70%) of links spanning Asia are those directed to East Asia. The second most popular destination is South-East Asia (25%), and the third is West Asia (4%). At the same time South Asia and Central Asia receive 1% and less than 1% of links respectively. However, these percentages vary considerably when they are considered at the level of the individual countries from which the links originate. There does not seem to be a single pan-Asian preference for establishing links to various regional destinations.

D. The spatial organisation of Asian cyberspace: links between geographical regions

Table 6 has three interlocking parts. Firstly, data are stated as absolute numbers. Secondly, they are expressed as percentages of the links leaving a given region of origin, broken down by destination (each row sums to 100%). Finally, data are expressed as percentages of the links reaching a given destination, broken down by region of origin (each column sums to 100%). In Table 6 data for individual countries are aggregated into five geographic regions, and absolute numbers are rounded to the nearest thousand.

Table 6: Geographic regions as destinations and sources of hyperlinks in core Asian cyberspace*
Source | Destination: East Asia | West Asia | South-East Asia | South Asia | Central Asia | Total
East Asia 10,037,000 118,000 746,000 70,000 23,000 10,994,000
West Asia 62,000 426,000 841,000 16,000 1,000 1,346,000
South-East Asia 73,000 6,000 1,202,000 6,000 0 1,287,000
South Asia 5,000 1,000 505,000 35,000 0 546,000
Central Asia 47,000 0 332,000 2,000 22,000 403,000
Asia Total 10,224,000 551,000 3,626,000 129,000 46,000 14,576,000
East Asia 91% 1% 7% 1% 0% 100%
West Asia 5% 32% 62% 1% 0% 100%
South-East Asia 6% 0% 93% 0% 0% 100%
South Asia 1% 0% 92% 6% 0% 100%
Central Asia 12% 0% 82% 1% 5% 100%
Total % by destinations 70% 4% 25% 1% 0% 100%
East Asia 98% 21% 21% 54% 50% 75%
West Asia 1% 77% 23% 12% 2% 9%
South-East Asia 1% 1% 33% 5% 0% 9%
South Asia 0% 0% 14% 27% 0% 4%
Central Asia 0% 0% 9% 2% 48% 3%
Total % by sources 100% 100% 100% 100% 100% 100%

* Rounded to the nearest '000
** Regional statistics include, where appropriate, data on countries' self-directed links, e.g. self-links from Pakistan are classified as links from South Asia and pointing to that region.
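The two percentage blocks of Table 6 can be derived from its block of absolute numbers by normalising first along rows and then along columns. The following Python sketch illustrates this under stated assumptions: the matrix is copied from Table 6 (in thousands of links), and all variable names are introduced here for illustration only.

```python
REGIONS = ["East Asia", "West Asia", "South-East Asia", "South Asia", "Central Asia"]

# Rows are source regions, columns are destination regions (thousands of links),
# copied from the first block of Table 6.
LINKS = [
    [10_037, 118,   746, 70, 23],  # from East Asia
    [    62, 426,   841, 16,  1],  # from West Asia
    [    73,   6, 1_202,  6,  0],  # from South-East Asia
    [     5,   1,   505, 35,  0],  # from South Asia
    [    47,   0,   332,  2, 22],  # from Central Asia
]

row_totals = [sum(row) for row in LINKS]        # links leaving each source region
col_totals = [sum(col) for col in zip(*LINKS)]  # links reaching each destination

# Share of each source region's links going to each destination (rows sum to 100%).
pct_of_source = [
    [100.0 * value / total for value in row]
    for row, total in zip(LINKS, row_totals)
]

# Share of each destination's links coming from each source (columns sum to 100%).
pct_of_destination = [
    [100.0 * value / total for value, total in zip(row, col_totals)]
    for row in LINKS
]

for name, row in zip(REGIONS, pct_of_source):
    print(f"{name:16s}", " ".join(f"{value:4.0f}%" for value in row))
```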

Findings: sources of links

D.1. Strong regional differences arise also in terms of production of Asia-directed hyperlinks. 75% of links originate in the East Asian region; West Asia and South-East Asia contribute 9% of links each; while South Asia and Central Asia generate 4% and 3% of intra-Asian links respectively.

D.2. Regional differences arise also in terms of the intensity with which various regions of Asia are hyperlinked to each other.

Findings: destinations of links

D.3. About 70% of Asian links crisscrossing the Asian continent lead to East Asia, 25% of them lead to South-East Asia, 4% to West Asia, 1% to South Asia, and less than 1% to Central Asia. This general trend applies only to Asia as a whole, and is influenced chiefly by a strong East Asian tendency to create "regionally introverted", i.e. region-focussed, linkages.

D.4. The other four geographic regions seem to observe a different logic:

5. Conclusions

It is time to bring all these different and fragmented strands of observations together. It is time to ask what kind of a general story the 14.6 mln core Asian hypertext links manage to tell us. Three sets of answers can be seen to emerge. The first group deals with Asian studies. The second and third deal with the methodology and theory of webometric research.

Asian Studies' issues

The analysis of destinations of hyperlinks reveals that the Asia-wide tendency for links to be self-directed is influenced chiefly by cyberspace preferences of web sites in Japan. That country - as well as Pakistan, China, Indonesia, Turkey, South Korea, Israel, Armenia, Thailand, Taiwan, Vietnam, and Singapore - clearly favours resources which are domestic over those which belong to other parts of Asia. However, the truth is that the great majority of Asian countries actually favour the very opposite policy. No fewer than 38 countries (or 76% of the total) clearly prefer to dispatch their links to places which are outside of their national boundaries.

Another pattern also emerges. At a national level, the preferred destinations of outgoing core Asian links are surprisingly few. The most often linked-to places are Indonesia, followed by Japan. At a regional level, the range of choices is naturally very limited. There the most often linked-to regions are East Asia, followed by South-East Asia. That East Asia, the economic powerhouse of the continent, and a region housing more than 90% of Asia's networked hosts, can attract very strong online attention is not surprising. That Japan (64% of Asia's hosts) (Ciolek, in press) can attract very strong online attention is not surprising either. What is surprising, however, is that the South-East Asia region (approx. 6% of computers) is able to attract 25% of outgoing links (incl. self-directed links), and Indonesia (less than 1% of Asia's networked computers) is able to attract 79% of such links. These sets of figures, when put side by side, cannot be reconciled with each other.

Moreover, it is not clear at all why Indonesia should be a country most heavily linked from such dissimilar venues as Afghanistan and Armenia, whereas Japan should receive a heavy amount of linkages from equally dissimilar venues such as Turkmenistan and Georgia. All this indicates, again, that the reasons for which creators of web pages and architects of web-sites build links to a given set of destinations need a separate and intensive investigation. Is Indonesia so attractive to links from other Asian countries because it has lots of small-scale web sites with each being a separately tempting target; or is it because Indonesian sites are polyglot and freely use a mixture of English and Bahasa Indonesia; or is it because they use a simple ASCII character set and thus are easily indexable and findable; or is it because their social and administrative controls over the content of their web sites are less strong than in other parts of Asia? Or is it the case that the observed strong preference for links to Indonesia is simply a temporary aberration within the holdings of the search engine, an aberration which will not be found in subsequent surveys? These are questions which need to be systematically explored.

Secondly, the analysis of sources of core Asian hyperlinks shows that their largest numbers come from Japan, South Korea, China, and the Philippines. The findings are intriguing, because in 2001 China and the Philippines represented 2% of all networked computers in Asia (Ciolek, in press). Yet, as this paper discovers, together they used that equipment to construct over 9% of hypertext links in Asia. On the other hand, the combined numbers of networked computers in Japan and South Korea in 2001 amounted to 70% of Asia's networked infrastructure, and the two countries taken together have constructed 67% of core Asian hyperlinks. Therefore, a secondary question arises: do countries with fewer computer resources make a more intensive use of the opportunities offered by WWW technology? Most probably not. This paper finds that in terms of absolute numbers Hong Kong, Yemen, Sri Lanka and Israel have generated very similar volumes of links. Yet in terms of the numbers of networked computers Hong Kong (3% of Asia's networked hardware) and Israel (2.5%) matched, more or less, their production of hyperlinks (2.6% and 2.1%, respectively). Simultaneously, Yemen and Sri Lanka, each with a minuscule fraction of 1% of Asia's networked machines, had generated 2.5% and 2.3% of links, respectively. This would suggest that in addition to purely technological factors, cultural and other variables must also be at play.

In sum, all these numeric results need to be firmly incorporated into the framework of such well established disciplines as human geography, political science, economics, sociology and anthropology. By themselves our data signify little or nothing of value to Asian studies.

Methodological issues

The main webometric lesson stemming from research reported in this paper is that a large scale study of cyberspace at its intermediate, i.e. meso-level is practicable. Data pertaining to the "anatomy and physiology" of an extended electronic realm can be collected in large numbers indeed. Moreover, it is possible for our observations to be linked to a particular political or geographical entity. Finally, it is possible to aggregate our observations and express them statistically in reference to increasingly broader contexts such as geographical regions, or politico/economic blocs.

Naturally, all studies using AltaVista's expertise rely on a tacit assumption that AltaVista data indeed constitute a random sample of a large sub-set of all public access online documents. Obviously, such a crucial assumption badly needs to be tested. Tools for critical comparisons of AltaVista's holdings with those of cyberspace at large are now urgently needed.

Another conclusion of this paper is that the research of cyberspace strongly resembles a study of an intricate Persian carpet. So far, we have glimpsed only a fragment of a whole of unknown size. Therefore, future work will greatly benefit from an approach which reveals the full structure of observational data, one which does so without restrictions caused by boundaries which are too narrowly defined. Moreover, such an approach needs to be able to show both the interlocking and nesting among any of the observed patterns.

In other words, the proper study of Asian cyberspace implies nothing less than a complete, country-level survey of the entire global cyberspace. To fully understand that which is within, we need to understand that which is without; we need to understand the context of our observations. It means, therefore, a study of the country-level destinations for all outgoing links which have their origin in Asia. It also means an investigation of all incoming links which address a target country in Asia. Fortunately, that is a feasible task. As this paper has demonstrated, the data collection stage can be easily accomplished. What remains difficult, however, is the subsequent data verification and cleaning. So far, tools have been developed to deal with observations pertaining to core Asian cyberspace. Now we need to develop techniques which expertly handle the remaining parts of the geographical world. Naturally, such techniques need to be widely publicised within the fledgling community of webometricians. This could be done through a number of communication channels, including the recently established (January 2002) specialist mailing list "webometrics@coombs.anu.edu.au".

Theoretical issues

Finally, the author of this exploratory paper is pleased to have made a couple of small theoretical advances.

First of all, our study has shown that the initial and self-evident distinction between the outgoing and incoming links needs to be augmented. A more coherent and more accurate picture of cyberspace emerges if we distinguish three types of links: the self-directed, outgoing and incoming ones. In addition, these can be grouped into two distinct classes: the produced links which comprise all self-directed and outgoing links for a given cyberspace node; and the received (i.e. incoming) links.

The above distinctions can be very handy. They can be immediately put to work to measure and describe key webometric properties of variously sized bodies of online information. Firstly, we are able to use them to devise an index of online "introversion." This can be expressed as a proportion of the self-directed to produced links: the larger the index, the greater is the focus on resources which are internal to a given cyberspace. Secondly, we can synthetically measure the hyperlink "productivity". That index is stated as a proportion of the produced links to the number of web pages (or to the number of web servers, or internet hosts) on which they reside. The third possible index measures an overall "attractiveness" of a given body of online documents. This index is defined as a proportion of the received links to the number of web pages (or the number of web servers, or internet hosts) to which they point. The index, therefore, is identical to a measure called "Web Impact Factor" or WIF, first proposed by Rodriguez i Gairin and Manuel (1997) and Ingwersen (1998), and extensively used in studies by Smith (1999a, 1999b) and Smith and Thelwall (2001). The fourth and final measure captures the degree to which various portions of cyberspace seem to be "hungry" for online information. That index is defined as a proportion of outgoing links to incoming ones. Again, the resultant value is meaningful: the larger the index, the more intensively a given group of websites can be said to pursue externally based information.
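The four indices described above reduce to simple ratios of link counts. The Python sketch below is one possible reading of those definitions, not the author's own implementation: the class and function names are assumptions, the Japanese link counts are taken from Table 1, and the page count is a purely hypothetical placeholder because the paper reports no page totals here.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkCounts:
    self_directed: int  # links that start and end in the same cyberspace
    outgoing: int       # links sent to other cyberspaces
    incoming: int       # links received from other cyberspaces

    @property
    def produced(self) -> int:
        """Produced links = self-directed + outgoing."""
        return self.self_directed + self.outgoing


def introversion(c: LinkCounts) -> float:
    """Self-directed links as a proportion of produced links."""
    return c.self_directed / c.produced if c.produced else math.nan


def productivity(c: LinkCounts, pages: int) -> float:
    """Produced links per web page (or per server, or per host)."""
    return c.produced / pages


def attractiveness(c: LinkCounts, pages: int) -> float:
    """Received links per web page; corresponds to the Web Impact Factor."""
    return c.incoming / pages


def hunger(c: LinkCounts) -> float:
    """Outgoing links as a proportion of incoming ones."""
    return c.outgoing / c.incoming if c.incoming else math.inf


japan = LinkCounts(self_directed=8_267_830, outgoing=640_920, incoming=209_834)
HYPOTHETICAL_PAGES = 1_000_000  # placeholder only, not a figure from the survey

print(f"introversion:   {introversion(japan):.3f}")
print(f"hunger:         {hunger(japan):.2f}")
print(f"productivity:   {productivity(japan, HYPOTHETICAL_PAGES):.2f}")
print(f"attractiveness: {attractiveness(japan, HYPOTHETICAL_PAGES):.2f}")
```

For Japan the introversion index reproduces the 92.8% self-directed share shown in Table 2; the two page-based indices are meaningful only once real page, server or host counts are substituted for the placeholder.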

There are other advances as well. This study has made practical use of a conceptual distinction between three elementary subsets of online materials. The first of these is the so-called inbound cyberspace, a body of online information which contains unilateral (i.e. unreciprocated) links directed to core cyberspace. The second subset is the core cyberspace itself, that is a bounded body of online information which contains links originating and terminating within those boundaries. Such boundaries can be established for any aspect of the studied phenomena and set at any scale, and at any level of abstraction. Also, they can be of any type. They may rely on spatial, semantic, and social distinctions; or be defined as any combination of these (or other) criteria. The third of our subsets deals with outbound cyberspace. This is a body of online information which contains unilateral (i.e. unreciprocated) links received from core cyberspace. Therefore, the whole tri-partite structure resembles a planet with a host of surrounding satellites which lie on two very distinct orbits.

It can be schematically represented as the following formula:
A,B,C => (X<=>Y) => D,E,F

where "=>" and "<=>" denote one-directional and two-directional links; web pages "A", "B" and "C" form the inbound space; "X" & "Y" form the core space; and "D", "E", "F" represent the outbound space.

At first glance the "orbits" model looks like a mere replica of the "bow-tie" model suggested by Broder et al. (2000). However, a closer inspection reveals that the differences between the "orbits" and "bow-tie" models are both subtle and basic. In Broder et al. (2000) the regions of cyberspace containing links leading to and leaving from their central, densely interconnected space are much larger than those in our model. This is because they can, in principle, contain web documents linked to one another in the form of lengthy indirect hypertext chains, e.g.

.... => A3 => A2 => A1 => (X<=>Y) => D1 => D2 => D3 => .....

However, the "orbits" model limits the scope of inbound and outbound only to those documents which have a direct hypertext connection with the central area.
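The restriction to direct connections can be shown with a short sketch. It assumes that the links are available as (source, target) pairs and that the core is given as a set of pages. The function name classify_orbits is hypothetical. The sketch also adopts one possible reading of "unreciprocated", applied at the level of pages: an outside page that both links to the core and is linked from it is set aside in a separate "reciprocal" group, which is an assumption of this example rather than part of the model.

```python
def classify_orbits(links, core):
    """Split pages outside `core` by their direct links with the core.

    links: iterable of (source, target) pairs, one per hyperlink.
    core:  set of pages inside the chosen boundary.
    """
    links_to_core = set()     # outside pages with a direct link into the core
    linked_from_core = set()  # outside pages directly linked from the core
    for source, target in links:
        if source not in core and target in core:
            links_to_core.add(source)
        elif source in core and target not in core:
            linked_from_core.add(target)
        # Links between two outside pages, or inside the core, are ignored.
    return {
        "inbound": links_to_core - linked_from_core,
        "outbound": linked_from_core - links_to_core,
        # Assumption: pages linked with the core in both directions.
        "reciprocal": links_to_core & linked_from_core,
    }


# The chain from the text: ... => A3 => A2 => A1 => (X<=>Y) => D1 => D2 => D3
links = [
    ("A3", "A2"), ("A2", "A1"), ("A1", "X"),
    ("X", "Y"), ("Y", "X"),
    ("Y", "D1"), ("D1", "D2"), ("D2", "D3"),
]
orbits = classify_orbits(links, core={"X", "Y"})
```

Here only "A1" falls into the inbound space and only "D1" into the outbound space. Pages "A2", "A3", "D2" and "D3" reach the core only through intermediaries and are therefore left out, whereas a "bow-tie" reading would include them.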

The second major difference applies to the nature and scope of the central area itself. Broder et al. (2000) state that their central zone is a portion of cyberspace where documents are densely interlinked with each other. In our model, we do not need to make that stipulation. We merely state that the documents in question reside within a certain clearly defined boundary and that all such documents, by definition, must be somehow linked. However, the density and directions of those links are left undetermined.

The third difference is equally important. Broder et al. (2000) describe global cyberspace as a single entity, and therefore they count each of its individual elements (e.g. web pages and links) only once. Our model, on the other hand, suggests that cyberspace may be envisaged as hierarchically organised and that its various component parts may act simultaneously at one, two, three or more levels and thus may play very different yet still overlapping roles. For instance, a web page (and its links) in country K may be a part of core cyberspace of region R, and a part of inbound cyberspace of region Z. Such hierarchies and overlaps of roles have not been envisaged in the original formulation of the "bow-tie" model.
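The overlap of roles can be sketched as follows. The example assumes that each link is reduced to a pair of country labels and that each region is a set of such labels; the labels "K", "L", "M", "N" and the functions link_role and roles_of_page are hypothetical. For brevity the sketch ignores the requirement that inbound and outbound links be unreciprocated.

```python
def link_role(source_country, target_country, region):
    """Return the role a single link plays relative to one region."""
    source_inside = source_country in region
    target_inside = target_country in region
    if source_inside and target_inside:
        return "core"
    if target_inside:
        return "inbound"   # starts outside the region, ends inside it
    if source_inside:
        return "outbound"  # starts inside the region, ends outside it
    return "outside"       # the link does not touch the region at all


def roles_of_page(page_links, regions):
    """Map each region name to the set of roles the page's links play there."""
    return {
        name: {link_role(s, t, members) for s, t in page_links}
        for name, members in regions.items()
    }


# Two hypothetical regions and one page in country K with two links.
regions = {"R": {"K", "L"}, "Z": {"M", "N"}}
page_links = [("K", "L"), ("K", "M")]
roles = roles_of_page(page_links, regions)
```

The same page thus belongs to the core cyberspace of region R (through its link to country L) and to the inbound cyberspace of region Z (through its link to country M), depending on which boundary is being examined.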

Finally, Broder et al. (2000) postulate the existence of archipelagos of sub-cyberspaces which are fully disconnected from the central area, as well as from the input and output zones. By contrast, the "orbits" model does not dwell on the existence or non-existence of disconnected bodies of online information. It simply deals with those documents which establish some form of connectivity between themselves. It also assumes that the overall areas of core, inbound and outbound cyberspaces are not fixed in size but, on the contrary, may grow and shrink in scope, not only in response to changes in the overall numbers of hyperlinked online documents, but also in response to the scope and granularity of our research questions.

In other words, Broder et al. (2000) suggested that global cyberspace always has five parts: the three central ones, tendrils and the disconnected portion. By contrast, our work suggests that global cyberspace may have any number of non-communicating component parts, from 1 to N, and that each of these mutually disconnected parts always has an identical structure: a core area and two satellite sub-cyberspaces.

I believe, therefore, that our "orbits" approach models the organisation of cyberspace more flexibly and more realistically.

6. Acknowledgements

My thanks are due to Jane Moran for her useful comments on the first draft of this paper.

7. References

The great volatility of online information means that some of the URLs listed below may change, or disappear altogether, by the time this article reaches its audience. Fortunately, since early 1996 most web sites world-wide have been systematically tracked and permanently archived by The Internet Archive (www.archive.org).
The date in round brackets indicates the document's version (as stated by the source itself, or the date it was last accessed by this author).

8. Version and Change History



Maintainer: Dr T. Matthew Ciolek (tmciolek@ciolek.com)

Copyright (c) 2002 by T.Matthew Ciolek. All rights reserved. This Web page may be freely linked to other Web pages. Contents may not be republished, altered or plagiarized.

URL http://www.ciolek.com/PAPERS/electronic-attention2002.html
