From Private Ink to Public Bytes:
the practice and theory
of future GIS-ready online cultural data-sets

Dr T. Matthew Ciolek,
Research School of Pacific and Asian Studies,
Australian National University, Canberra ACT 0200, Australia
tmciolek@coombs.anu.edu.au
http://www.ciolek.com/PEOPLE/ciolek-tm.html

To be presented at the Electronic Cultural Atlas Initiative (ECAI) session of the
Pacific Neighborhood Consortium (PNC) Annual Meeting,
City University, Hong Kong, PRC,
15-20 January 2001 [a first and never completed draft - tmc, Aug 2024]

Document created: 3 Oct 1999. Last revised: 1 Oct 2000.

0. Abstract

1. Introduction

A meeting in a North African city

Once upon a time, or more exactly, on a warm spring day of 'anno urbis conditae' MCXXVII (374 CE) in the provincial city of Hippo Regius [Long: 7.733333, Lat: 36.866669, ADL Feature ID: 1050346], a young but already promising jurist (and poet), Lucius Afranius, went to make some purchases in a reed-basket shop. Upon completing the transaction he re-entered the street and spotted there, next to a stall with incense and medicinal herbs, two eminent scholars: Rufus Optimus and Aetius Maximus. The first man was a learned visitor who had come to Hippo a few months earlier from as far away as Leptis Magna. The second was a local celebrity. The following is an imperfect reconstruction of the three-way conversation which ensued.

Afranius: "Ave, amici. Now, with all your exotic shopping completed, where are you proceeding to?"

Optimus: "Ave, young man. Good to see you. My good friend Maximus has invited me to his villa, in the Curtius hills, to show me an olive tree he planted there the other day."

Afranius: "What is so exciting about such a tree? You have seen one, you have seen them all, that's my opinion."

Maximus: "Ave, Lucius Afranius. The olive sapling is merely an excuse. We need one, for the great Rufus Optimus and I are going to have a lengthy, and much needed dare I say, exchange of ideas. Verily, the best way to discuss things in a serious manner is the peripatetic way."

Optimus: "We are discussing, dear Afranius, the best way to collect and record information about the trade routes which cross the seas as well as the provinces of our illustrious Empire."

Afranius: "Why this sudden interest in the trade routes? Are you planning, noble sirs, a business venture?"

Maximus: "No, my impressive young colleague. Not a business venture. An intellectual one. I view them as a good example of an Eckenstein Boulder."

Afranius: "A boulder? An Eckenstein Boulder?"

Optimus: "Yes. Before you do truly difficult things, you should first try your skill (and luck) on simpler ones. Before climbing a tall mountain, you need to test your rock-climbing techniques on a small, but sufficiently challenging scale. For details of Eckenstein's training device see Newby (1974:37) or Ciolek (2000. Digitising Data...). In short, if you manage to handle the minor difficulties of the boulder all right, you can progress to tackling more complicated things, such as climbing snow-clad mountains."

Afranius: "I see. Such as?"

Maximus: "Such as handling data about polygons."

Afranius: "What do you mean? I am lost."

Optimus: "Handling data about roads and communication lines, whether they are used for trade, pilgrimages or merely movements of our messengers and legions, boils down to handling data about points or places (Ciolek (2000) calls them 'nodes') and data about the interconnecting lines."

Afranius: "No, this cannot be so. Roads, let alone maritime shipping lines, are full of curves and sudden changes of direction. A simple direct line cannot adequately represent the variety of paths a traveller takes when going from one city to another, from one harbour to the next."

Optimus: "No worries, it can. A straight line in Ciolek's conceptual scheme is always a generalization, a temporary hypothesis, established only to be replaced by subsequent, more detailed determinations. Any curved line, regardless of how much it weaves and meanders, can always be constructed from n straight segments. These segments can be kilometers or just a few meters or centimeters long. No worries, then."

Afranius: "Now I see. A clever approach. A very clever one, because it means that when collecting data on all the roads which lead to Carthago (pardon this joke, I could not resist it), you do not worry, prematurely at least, about topography and exact topographical detail. You simply collect data about the locations of as many points as possible through which these roads lead, and then plot them on a map, and draw all the interconnecting lines."

Optimus: "Exactly, my friend, exactly."
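The idea voiced above, that a route is a chain of nodes joined by straight segments and that each added node refines the picture, can be illustrated with a minimal Python sketch. It is not part of the OWTRAD toolset. The coordinates of Hippo Regius come from the text; the position of Carthago is approximate, the intermediate waypoint is purely hypothetical, and the function names are invented for this illustration.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def route_length_km(nodes):
    """Length of a route taken as the sum of its straight segments."""
    return sum(haversine_km(p, q) for p, q in zip(nodes, nodes[1:]))


hippo = (36.866669, 7.733333)  # coordinates given in the text
carthago = (36.853, 10.323)    # approximate position (assumption)
waypoint = (36.50, 9.00)       # hypothetical intermediate node

# First hypothesis: one straight line between the two cities.
coarse = route_length_km([hippo, carthago])
# Refined hypothesis: the same route described by two shorter segments.
refined = route_length_km([hippo, waypoint, carthago])
```

Each extra node replaces one segment with two, so the refined estimate can only stay the same or grow as the line follows the road more closely.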

Afranius: "But still, I am puzzled. If you are in such good agreement on Ciolek's methodology, why do you have to walk as far as those distant hills, to Maximus' villa and to the freshly planted olive tree? Certainly your conversation could be completed in a far shorter time. You could take a nice stroll to the forum instead. Will your conversation be about the polygons, by which I take it you mean a closed (loop-like) series of nodes connected by short stretches of lines which can be used to identify some area, small or large, it does not matter here, such as the wheat-fields of our city, or the various provinces of our Empire, or the multifarious, much overlapping and tangled terrains inhabited by speakers of a given tongue?"

Maximus: "Well (to use a Celtic way of signalling lengthy utterances), our conversation-to-be will not deal much with points, lines or polygons. The first two items have already been amply discussed. The third one represents a separate problem, one which we shall look at some other time. The reason why we walk towards my house and the increasingly famous olive tree is a different one. It deals with the manner in which we scholars should manage new computing and networking technologies. It deals with the question of how we should take true advantage of the opportunities afforded by these technologies."

Afranius: "Computing and networking? Porca miseria! What are they?"

Optimus: "Do not worry. These are electronic tools. They will be invented long after you have completed your long and fruitful life."

Afranius: "I see. Carry on, Maximus. What do you mean by the true advantage?"

Maximus: "You see, normally we are accustomed to making very good but old uses of new technology, whereas what is really needed is the new use of it, one which is commensurate with the full power offered by a new tool, by a new resource."

Afranius: "For example?"

Optimus: "Examples are easy. Worthwhile practical solutions are not. Millennia ago our ancestors used stone choppers and wooden clubs to hammer their opponents into submission. The advent of metal, bronze and iron, changed all that. Now warriors could use swords. However, they did not use their weapons to bash or slap, but, instead, to slash and stab. The time-honoured practice of bashing someone on the head with a lump of metal would be a silly operation indeed."

Afranius: "Sure, you made your point eloquently. So the two of you are trying to figure out how these computers and electronic networks should be put to a sensible scholarly use, not necessarily the one to which we have been accustomed on the basis of past experiences."

Maximus: "Yes. The key questions are manifold. For instance, how do we arrange our research and publishing activities that: In short, how do we bridge the gap between the culture of 'Ink' and that of 'Bytes', and have the best of both worlds."

Optimus and Afranius [calling out unisono]: "'Ink' and 'Bytes', what do you mean?"

Maximus: "I will explain these terms later. But essentially, what I mean is a simple but nagging question: 'How do we move from the condition where information is used as an illustration, that is, as a mere self-satisfied and static picture aiding some point in a discourse, to a situation where information is fragmented into a myriad of elements, with each one capable of being used as a verifiable datum, as a computable, interchangeable, and correctable fact?'"

Optimus: "Wow, a tall order indeed."

Afranius: "Is the house far enough? It sounds like an interesting morning. May I join you in your peripatetic exercises?"

Maximus and Optimus [jointly]: "Yes, of course. Come with us. And we promise, you will not regret it."

Optimus' tale:

Afranius: "So, where do we start our journey?"

Maximus: "At the beginning. We shall start by asking ourselves how we would go about collecting and mapping data, say, on the trade routes and other movement and communication lines established between Alexandria in Egypt and Merv in Persia. Optimus, my friend, how would you proceed, assuming that you have no other obligations? And - also - assuming that you have access to all the equipment and software you need for the task?"

Optimus:
a. drawing a series of maps (sketch or detailed)
b. direct digitisation and input into a GIS system
c. full-text database of images and text-fragments

However, problems with each of the approaches:
- problems with the sources - they are messy
- problems with technology
  * mapping: drawing is fiddly, needs expertise and lots of time, and ADDs to the confusion and crowding
  * direct digitization: scanning and GISing require costly equipment, need expertise, and are fiddly (what do you do with old maps: illegible, no coordinates, no projection?)
  * mess is searchable, but it continues to be a mess.

Maximus' tale:

a. some general thoughts
- all information consists of fragments which have a triple aspect (structure, content, presentation)
- one needs to capture them
- one needs to use them as lego blocks

b. possible strategies
- XML-based annotations
- 'strong' database of information fragments
- 'weak' database of information fragments

Advantages and disadvantages of each of the approaches

Optimus is curious, asks for elaboration. Maximus obliges:
- fragment all information into constituent semantic elements
- use the same fragmentation procedure for ALL information, a common template
- bring fragments together into data sets organised by place and date
- place the sets into containers which provide: (a) the data themselves, (i) original, (ii) derivatives; (b) directional meta-data; (c) modifications meta-data; (d) analysis meta-data
- provide each datum with geographic coordinates
- use them to: (a) do statistical analyses; (b) create maps (using a GIS software) as you please.
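One possible reading of Maximus' container can be sketched in Python. This is an illustration only, not the actual OWTRAD container format: every class and field name is an assumption, the sample statements are placeholders, and the coordinates of Alexandria and Merv are approximate.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Fragment:
    """One information fragment (all field names are illustrative)."""
    fragment_id: str
    place: str
    date: str
    statement: str
    lat: Optional[float] = None
    lon: Optional[float] = None


@dataclass
class Container:
    """Holds the data plus the three kinds of meta-data named in the text."""
    name: str
    original: List[Fragment] = field(default_factory=list)
    derivatives: List[Fragment] = field(default_factory=list)
    directional_metadata: Dict[str, str] = field(default_factory=dict)
    modifications_metadata: List[str] = field(default_factory=list)
    analysis_metadata: Dict[str, str] = field(default_factory=dict)

    def by_place_and_date(self) -> Dict[Tuple[str, str], List[Fragment]]:
        """Group the original fragments into sets keyed by (place, date)."""
        groups = defaultdict(list)
        for fragment in self.original:
            groups[(fragment.place, fragment.date)].append(fragment)
        return dict(groups)


# Hypothetical sample data; statements are placeholders, coords approximate.
container = Container(name="hypothetical-alexandria-merv")
container.original.extend([
    Fragment("F1", "Alexandria", "374", "placeholder statement 1", 31.20, 29.92),
    Fragment("F2", "Alexandria", "374", "placeholder statement 2", 31.20, 29.92),
    Fragment("F3", "Merv", "374", "placeholder statement 3", 37.66, 62.19),
])
groups = container.by_place_and_date()
```

Because every fragment follows the same template, the grouped sets can be passed unchanged to a statistical routine or exported for a GIS package.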

Optimus is excited, and asks for more information. Maximus obliges:

2. Practice

Basic principles

* all information is collected to a common format
* all data and procedures are made explicit
* all data and procedures are made public
* all data are correctable
* all data reside in well-shaped containers
* all information can be collected by hand or by a computer
* all information ends up online
* the principle of public visibility
* each fragment of information is uniquely identified
* each fragment of information is always linked to its source
* each fragment of information is evaluated: Lines QA3, Places (exact or estimated coords)
* all data about lines carry a problem/no-problem tag
* data production is separated from data analysis (however, initial data analysis is used to provide a data quality check)
* each fragment is treated like a hypothesis
* hypotheses have hierarchies of certainty
* several hypotheses pointing in a certain direction enable us to make an informed decision
* principle of separation of Data from Conclusions, data points from maps (which are models)
* principle that you can mix and match records but you can always return to the original data-set, and to the meta-information about it, and - therefore - to the source
* principle that ultimately it is not the place-name, not its coords, but its ID number which keeps the bastards honest.
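The principles that each fragment is a hypothesis, that hypotheses differ in certainty, and that several of them pointing the same way support an informed decision can be sketched as follows. The three-step certainty ladder, the record layout and the rule of simply adding up certainty scores are all assumptions made for this illustration; the text does not prescribe them.

```python
from collections import defaultdict

# Hypothetical certainty ladder; a higher number means more certain.
CERTAINTY = {"conjectural": 1, "probable": 2, "attested": 3}

# Each hypothesis keeps its own ID and a pointer to its source.
# A claim here means "a route segment links these two nodes".
hypotheses = [
    {"id": "H1", "claim": ("Alexandria", "Merv"),
     "source": "source A", "certainty": "probable"},
    {"id": "H2", "claim": ("Alexandria", "Merv"),
     "source": "source B", "certainty": "probable"},
    {"id": "H3", "claim": ("Alexandria", "Leptis Magna"),
     "source": "source C", "certainty": "conjectural"},
]


def weigh(records):
    """Return claim -> (summed certainty, IDs of supporting hypotheses)."""
    totals = defaultdict(lambda: [0, []])
    for record in records:
        entry = totals[record["claim"]]
        entry[0] += CERTAINTY[record["certainty"]]
        entry[1].append(record["id"])
    return {claim: (weight, ids) for claim, (weight, ids) in totals.items()}


support = weigh(hypotheses)
# The best-supported claim; its IDs lead back to the original records.
best_claim = max(support, key=lambda claim: support[claim][0])
```

The summary never replaces the records: the list of IDs kept with each claim is what allows a reader to return from the conclusion to the data and from the data to the source.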

Basic steps

  1. Locate the source of information
  2. Take the record (xerox or simple OWTRAD notes)
  3. Expand the notes
  4. Shape the OWTRAD notes, make them fully fledged
  5. Create a temp list of novel Nodes
  6. Check them against OWTRAD GAZETTEER
  7. For all missing nodes
  8. Proofread the initial data
  9. Load the initial data into data set container (XHTML compatible, meant to sit online)
  10. Give the container a meaningful name.
  11. Create meta-data, according to the OWTRAD template
  12. Load the initial data into Places Processor
  13. Match data-set names with Gazetteer names, obtain coords
  14. Export the Places data
  15. Load the initial data into Routes Processor
  16. Match data-set names with Gazetteer names, obtain coords
  17. Export the Routes data
  18. Add Places headers
  19. Add Routes headers
  20. Import places into a GIS software, display them on a map.
  21. Visually inspect and verify the correctness of the data (use 2 & 4 as control)
  22. Correct, repeat previous steps till satisfied
  23. Import routes into a GIS software, display them on a map.
  24. Visually inspect and verify the correctness of the data (use 2 & 4 as control)
  25. Correct, repeat previous steps till satisfied
  26. Store final results of 21 and 24 in a data-set container (point 9 above)
  27. Store the container + data online
  28. Notify interested parties about its existence, invite corrections
  29. Act on corrections
It is an idealised picture; in practice corners are cut.
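Steps 5-7 and 13 (listing new nodes, checking them against the gazetteer, obtaining coordinates) might look roughly like the Python sketch below. It is a guess at the logic, not the actual Places Processor. The gazetteer layout and the function name are assumptions; the Hippo Regius entry uses the ADL Feature ID and coordinates quoted in the Introduction, while the Leptis Magna ID is hypothetical and its coordinates approximate.

```python
def match_nodes(names, gazetteer):
    """Split node names into those found in the gazetteer and those missing.

    Matching ignores case and surrounding whitespace (an assumption).
    """
    matched, missing = {}, []
    for name in names:
        key = name.strip().lower()
        if key in gazetteer:
            matched[name] = gazetteer[key]
        else:
            missing.append(name)
    return matched, missing


# Toy gazetteer keyed by normalised place-name.
gazetteer = {
    "hippo regius": {"id": "1050346", "lat": 36.866669, "lon": 7.733333},
    # Hypothetical ID, approximate coordinates.
    "leptis magna": {"id": "HYPOTHETICAL-1", "lat": 32.64, "lon": 14.29},
}

# Temporary list of nodes taken from a set of notes (step 5).
temp_nodes = ["Hippo Regius", " leptis magna ", "Curtius hills"]

matched, missing = match_nodes(temp_nodes, gazetteer)
```

Names that end up in `missing` are the ones step 7 has to deal with before the data set can be exported and mapped.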

Basic tools

- container template
- notation system
- gazetteer of coords
- database keeping track of id numbers
- dbase data convertors
- template
- populated templates (place1, place2, place3, routes1, routes2, routes3), one for each of the data sets. The idea is to put emphasis on usable, disseminable data sets, not on a central database (a mausoleum). Databases are implicitly guarded against changes, data sets are not.
- notebook with facts to guard the truth against outrageous claims made so frequently by ourselves or others (in fact it can be made into a separate data-set)

Summary

- OWTRAD methodology is suitable for the creation of all types of data based on points and lines (but - not yet - for polygons [areas])
- it is a general purpose system. It can handle information about

3. Theory

Optimus: "There is a method in this madness. Could you elaborate?"

Maximus obliges:
* Informational structure of scholarly endeavours
  3 levels: theory, models and data
  5 aspects: (i) actual content; (ii) terminology; (iii) methodology; (iv) apparatus and documentation; (v) meta-data

Afranius: "Well, it looks like another of your beloved tables, matrices of relationships."

Maximus: "Yes, indeed. But in fact the situation is more complex. There are info-soup layers between the data and models layers, and between the models and theory layers.

- data and models: need informational hooks to other data and other models
  * to verify them,
  * to synthesise them
- hooks in the form of references to places, dates, names of individuals (or at least names of peoples or cultures)."

Optimus and Afranius, unisono: "Why?"

Maximus: It has to do with the transition from the "ink" to the "bytes" way of thinking about our work. An example? Negroponte (1995) made a distinction between atoms and bits (19...); I suggest a distinction between Ink and Bytes. It is in fact a symbol for two modes of thinking:

      INK                          BYTES
  (a) private                      public
  (b) preoccupied with content     preoccupied with formats
  (c) idiosyncratic                interoperable
  (d) hand-crafted                 machine-produced
  (e) solitary                     cooperative
  (f) holistic                     fragmented
  (g) slow                         quick
  (h) portable                     desk-bound
  (i) technology independent       technology dependent
  (j) errors invisible             errors transparent

Optimus: It looks like another table of yours.

Maximus: Yes. In fact it is a fragment of a much larger table, one involving 5 modes of thinking and 19 variables. Here it is [table will be shown here]

The fifth column is the environment at which we should strive to arrive.

Afranius: OK, so the data will be collected more easily, standardised, disseminated, and used. Will it make the humanities and social sciences, Quadrivium and Trivium, a better place?

Maximus: I do not know, but if I see an area which needs repair or improvement, and I fail to act, I will be a lesser human being.

But here we are, finally, in my garden. Can you see the freshly planted olive tree? A cheerful and energetic little fellow, no more than 1.5 m tall. And yet, in just a few months it will bear fruit. Can you see how beautifully it fits the setting? How much sun it soaks up every day? How grey and silver its leaves are? How they shimmer and scintillate in the light? Doesn't it gladden your hearts? It certainly gladdens mine.

[to be continued ...] [and it, sadly, never was - tmc, 21 Aug 2024]

4. End Matters

Appendix 1: OWTRAD meta-data template

Appendix 2: A fragment of an OWTRAD notebook entry

Appendix 3: OWTRAD notation system (v.2.0, Oct 2000) [see http://www.ciolek.com/OWTRAD/notation.html]

Appendix 4: OWTRAD rules

Appendix 5: A specimen of an OWTRAD dataset - for instance
Ciolek, T. M. 2000. Georeferenced data set (Series 1 - Routes): Roads in India during Mughal rule 1556-1707. OWTRAD Dromographic Digital Data Archives (ODDDA). Old World Trade Routes (OWTRAD) Project. Canberra: www.ciolek.com - Asia Pacific Research Online. www.ciolek.com/OWTRAD/DATA/tmcINm1550.html

9. About the Author

Dr T. Matthew Ciolek, a social scientist and networked knowledge architect, heads the Internet Publications Bureau, Research School of Pacific and Asian Studies, The Australian National University, Canberra, Australia. His work and contact details can be found online at http://www.ciolek.com/PEOPLE/ciolek-tm.html

10. Acknowledgements

I am grateful to xxx for their critical comments on the earlier version of this essay.

11. References

[The great volatility of online information means that some of the URLs listed below may change by the time this article is printed. The date in round brackets indicates the version of the document in question. For current pointers please consult the online copy of this paper at http://www.ciolek.com/PAPERS/pnc-hongkong-01.html]

12. Version and Change History


Maintainer: Dr T.Matthew Ciolek (tmciolek@ciolek.com)

Copyright (c) 2000 by T.Matthew Ciolek. All rights reserved. This Web page may be freely linked to other Web pages. Contents may not be republished, altered or plagiarized.

URL http://www.ciolek.com/PAPERS/pnc-hongkong-2000.html
