THE EMPIRICAL PROCESS OF MAPPING BLTs: TWO CONTRASTING CASES
Preparing the datasets of Winchester and Chunchucmil (1)
From here on, this chapter mainly follows the workflow of mapping practice that creates BLT data. Progress is tracked by the bracketed numbering in the headings. This workflow can be summarised heuristically in the following steps:
Preparation of datasets: acquiring, assembling, digitising and converting the source materials to the same format (usually concerns pre-existing or legacy spatial data and/or maps);
Mapping (tracing) the outlines of major occupiable subdivisions of the built environment as represented in the source material to create equivalent spatial information as the foundation of the outline base plan;
Case-specific conjecturing to resolve any remaining data gaps and ambiguities (data needs to be spatially contiguous), and subsequently revising the resultant outline base plan to ensure equivalent spatial data with topological integrity;
Identifying the BLTs by remapping (tracing) the outline base plan with conceptually validated individual data entries (polylines), while also revising and correcting the resultant spatial data structure to assure topological integrity.
In practice it can be expected that these steps, while presented discretely, bleed into each other reflexively (especially steps 2–3 and 3–4). This is flagged below where relevant in my account of how these general steps play out in case work.
The initial stage of preparation means converting the mapped data to an equal and appropriate digital format across all datasets. This technical format needs to be achieved before the further steps are taken, to ensure that the datasets represent the same level of conceptual detail and ultimately follow the same conventions to convey information. Chapter 6 highlighted the methodological and integrative potential of GIS software. Here ESRI’s ArcGIS, version 10, was used as the primary GIS environment to conduct all methodologically specific mapping. ArcGIS was selected because of its widespread use in academic contexts, both in geography and archaeology, as well as its versatility in handling topological information in vector data (i.e. points, lines, and polygons; lines and their connections are of paramount importance here). Nonetheless, other software packages were used for digitisation and data conversion, and will be named when relevant.
The following account is provided with the aim of enabling those pursuing a similar workflow. The four steps are somewhat simplified. As familiarity with each particular case and each individual data source grows, rules of thumb emerge pragmatically in the data creation processes, resolving ambiguities, uncertainties, and confusing data situations with a degree of subjective judgment. For our two cases and their data sources, the rules of thumb that emerged in this research are presented at the end of this chapter. Apart from such case-specific particularities, the general steps of the workflow can be pursued as stated. In this account the technical details are kept concise, and generic information on digital work is omitted. I will focus on the sequence of work and decisions for data preparation that produced usable results in the variety of situations that my test cases comprise. Since software updates can be expected to supersede various software-based work sequences, the principles are more important than my precise actions within the software. Where instrumental to the results, the processes will be described.
Winchester maps
The Winchester city plans used represent both the present-day situation and historical situations. The contemporary situation is based on the current large-scale mapping standard of the British Ordnance Survey, called MasterMap (from here on: MM). MM is a digital product of the Ordnance Survey (from here on: OS), updated as often as every six months. The version for Winchester was downloaded at the end of October 2011 (University of Leeds academic license for EDINA services), with the OS providing the complementary OS Imagery Layer (OS official aerial photography) and OS Address Layer (version 2) on disc in April 2012. The first historical time-slice is based fully on the first edition of the large-scale (1:500) OS city plan published between 1871 and 1872 (from here on: OS1872) (University of Leeds academic license for EDINA’s Historic Digimap). This is in keeping with common practice established by urban morphology (Chapter 6). The additional time-slices are sourced from the plans reconstructed for the later medieval period by Derek Keene (1985), respectively for around 1300, 1417 and 1550. Within the confines of testing the methodology, the Winchester case does not reach further back than the period around 1550 (from here on: 1550s), which demonstrates the same principles as would apply to the processes required for the two earlier possible time-slices. Future research could also consider making the temporal resolution between time-slices more fine-grained.9
To emphasise: the three city plans thus used are each of a different nature. MM is a born-digital plan, fully ready for conversion into GIS formats and visualisations, and represents the best British national mapping standards of accuracy. OS1872, while produced to be accurate, is essentially a historical document. It was acquired in geoTIFF format (i.e. TIFF image files with a basic level of georeferencing: projecting, locating, and scaling), containing the digital scans of the original sheets. Finally, 1550s is a historically reconstructed map, dependent on the academic cartographic and historical research practice that produced it.10 Interestingly, the mapping standards of OS1872 feature more detail than MM. In contrast, the burgage-plot-based historical research by Keene (1985) that produced 1550s provides only a basic level of detail, especially omitting the architectural morphology that would give us building outlines. Regardless, before any data standardisation can take place, OS1872 needs vectorisation and georeferencing to MM, while 1550s needs digitisation and vectorisation to be geospatially linked to the other two.
Starting with 1550s: the Keene plans of medieval Winchester needed digitisation. Keene (1985) reproduces them at 1:2500 scale, separated out in numerous small sections. While these could be scanned from the books and digitally stitched together, the match errors of so many seams would compromise the quality of the resulting plan. Therefore the originals were tracked down at the Winchester Research Unit (curated by Martin Biddle and Katherine Barclay), which stores them in the depot of the Winchester City Museum. These originals consist of large sheets of film on which the line drawing of the map was draughted. They display the medieval city in only five parts: the walled area and the north, east, south, and west suburbs. Their large scale, as well as their less fragmented and unannotated nature, was likely to increase the quality and direct usability of the digital end product tremendously.11
To avoid photographic lens distortions, digitisation was carried out using roller scanners, machines that scan large physical documents in flatbed fashion. The resulting high-quality raster images (400–600 dpi) needed cleaning and filtering to remove digital noise and blemishes on the original films, enhancement for contrast and definition, and stitching together to compose one entire city plan.12
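What such raster clean-up involves can be sketched with open-source tools; a minimal example assuming the OpenCV library and hypothetical file names (the stitching of sheets is left to dedicated software):

```python
import cv2

# Load one scanned film sheet as a greyscale raster.
img = cv2.imread("film_sheet_1.tif", cv2.IMREAD_GRAYSCALE)

# Non-local means denoising suppresses scanner noise and small blemishes
# while preserving the edges of the draughted lines.
denoised = cv2.fastNlMeansDenoising(img, h=10)

# Contrast-limited adaptive histogram equalisation (CLAHE) enhances the
# local contrast and definition of fainter line work.
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(16, 16))
enhanced = clahe.apply(denoised)

cv2.imwrite("film_sheet_1_clean.tif", enhanced)
```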
Accepting MM as the standard of accuracy, georeferencing and georectification of the historical time-slices were carried out in direct relation to MM. This is an alternative to the practice of setting up a proper set of control points onsite with dGPS (differential GPS, cf. Lilley 2011a). GPS error margins could cause unwanted discrepancies between the points taken and MM, which would require superfluous rectifications of MM in addition to the historical layers. Instead, assisted by Keene (pers. comm. 2011), historically persistent points in the current built environment were identified and photographed onsite for future reference. These historically persistent locations and photo directions were then documented as a GIS point layer on top of MM (the photographs themselves show little context). In line with expectations of urban development, fewer points persisted from the 1300s than from each more recent time-slice. The historically persistent points served as an initial set of control points for georeferencing and georectifying the historical layers.13 Although OS1872 is delivered with a basic level of georeferencing, these control points were also used on that time-slice to achieve a closer geographical match with MM.
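For illustration, such a documentation layer can be built in any GIS; a minimal sketch using the geopandas library, with hypothetical feature names and coordinates on the British National Grid (the projection MM uses):

```python
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical field records: historically persistent locations with the
# compass bearing of the reference photograph taken onsite.
records = [
    {"feature": "church corner", "photo_dir_deg": 270, "x": 448650.0, "y": 129320.0},
    {"feature": "bridge abutment", "photo_dir_deg": 90, "x": 448710.0, "y": 129290.0},
]

points = gpd.GeoDataFrame(
    records,
    geometry=[Point(r["x"], r["y"]) for r in records],
    crs="EPSG:27700",  # British National Grid, as used by MM
)
points.to_file("persistent_points.shp")
```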
Initial georeferencing of OS1872 using these control points enabled me to pick out a series of additional points across the time-slices that clearly related to specific corner and intersection locations in MM. Employing ArcGIS’s proprietary higher-order georectification warps, in an iterative and visual process of selecting appropriate points and warps, these additional points improved the relative accuracy of each time-slice. Errors cannot be avoided in the georeferencing process (see the result in Table 7.1). An additional error is introduced by OS1872 consisting of multiple sheets: these were published separately over two years, while the city was developing at a rapid rate, causing imperfect matches (see Fig. 7.2). The rectification can be made permanent by transforming the raster file14 (most effective is saving to TIFF with LZW compression), which creates a new raster dataset incorporating the warp. On this basis vectorisation takes place. Since for OS1872 this immediately entails extracting the base plan, it is described later.
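The referencing-and-transforming sequence can also be sketched outside ArcGIS, for instance with the open-source GDAL library; the control points below are hypothetical, and the second-order polynomial warp merely stands in for ArcGIS’s proprietary transformations:

```python
from osgeo import gdal

# Ground control points: map X, map Y, elevation, then pixel and line
# position in the scanned sheet (all values hypothetical).
gcps = [
    gdal.GCP(448500.0, 129400.0, 0, 1200.5, 880.0),
    gdal.GCP(448900.0, 129150.0, 0, 5310.0, 4120.5),
    # ... further points picked against MasterMap
]

# Attach the control points and target projection to the scan.
referenced = gdal.Translate("os1872_gcps.tif", "os1872_scan.tif",
                            GCPs=gcps, outputSRS="EPSG:27700")

# Warp into a new raster dataset that permanently incorporates the
# rectification; LZW compression keeps the TIFF lossless but compact.
gdal.Warp("os1872_rectified.tif", referenced,
          dstSRS="EPSG:27700",
          polynomialOrder=2,
          creationOptions=["COMPRESS=LZW"])
```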
Fig. 7.2
Mismatch at a plan seam in OS1872.
The intrinsic mismatches at the seam (horizontal line in the middle) between two sheets in OS1872 cause inevitable errors in georeferencing. (Image extracted from originals: © Crown Copyright and Landmark Information Group Limited 2013. All rights reserved. 1872.)
Table 7.1 Results of georectification for the OS1872 time-slice.
Plan | No. of Points Used | Warp | RMS error |
---|---|---|---|
OS1872 | 74 | Adjust | 0.03241 |
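For clarity, the reported RMS error summarises the residual distances by which the warped control points still miss their MM targets; a worked example with hypothetical residuals (units follow the map’s coordinate system):

```python
import numpy as np

# Hypothetical residual distances between four warped control points and
# their MasterMap targets; the real run used 74 points (Table 7.1).
residuals = np.array([0.021, 0.035, 0.040, 0.028])

# Root mean square: square the residuals, average them, take the root.
rms = np.sqrt(np.mean(residuals ** 2))
print(f"RMS error: {rms:.5f}")  # ~0.03182 for these values
```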
Taking into consideration the intensity of the mapping processes needed to carry out initial methodological tests for diachronic comparisons, a small test area (approx. 175x200m, Fig. 7.3) was selected to progress the work on Winchester. This area was deliberately chosen to include an intramural and an extramural part of the city where the city wall has been removed, so that it would show clear contrasts between persistence and change. The eastern part of the city centre, around the former East Gate and bridge, offers a good diversity of spaces within a historically well-developed suburb of the city. Nonetheless, note that a small section could never incorporate the full variety of spatial morphology within the urban built environment concerned.
Fig. 7.3
The approximate location and extent of the test case area indicated within MM of the historical core of Winchester.
(Extracted from OS MasterMap. © Crown Copyright 2013. All rights reserved. An Ordnance Survey (EDINA) supplied service.)
For the next historical time-slice, 1550s, the approach differs from OS1872. Consisting of original scans, the file is, after image processing (cleaning, enhancing and stitching), as yet completely ungeoreferenced. Since the image file contains an unannotated line drawing (similar in nature to Fig. 7.2, but without text), classification into only two value classes (i.e. a bi-tonal image) would make it susceptible to automated vectorisation. Thus prioritising vectorisation, I established separate feature classes (geodata files) to distinguish Keene’s (1985) own original historical conjectures from urban features he deemed certain at the time. On this basis, 1550s was vectorised before georeferencing and georectification, thereby significantly improving the manageability of the file size in the ArcMap environment.
Gregory & Ell (2007) warn that although in principle the historical researcher’s best friend, automated vectorisation is not sufficiently effective in practice to take over the task: the extent of manual editing required afterwards would equal that of a manual vectorisation process. In spite of several years of software development since, even on the very clear line-drawn 1550s map I had to conclude that fully automated vectorisation could not be trusted. Issues encountered include undue breaks along thinner lines, directional confusion along thick lines, and unintended disorder along dashed lines (the software does not recognise drawing conventions). However, ArcGIS’s ArcScan tools provide a semi-automated form of vectorisation, which significantly speeds up the manual tracing of the original image with polylines. This process still requires human intervention to avoid improper ruggedness in the shape of polylines derived from thicker scanned lines. The upside, however, is that one has direct control over the data produced, significantly reducing the aforementioned errors of automation.
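The principle behind automated vectorisation, and where it breaks down, can be sketched independently of ArcScan; a minimal example assuming the open-source scikit-image library and a hypothetical file name:

```python
from skimage import filters, io, morphology

# Load the stitched line drawing and classify it into two value classes
# (bi-tonal): ink versus background.
img = io.imread("1550s_stitched.tif", as_gray=True)
binary = img < filters.threshold_otsu(img)

# Thin every drawn line to a one-pixel-wide skeleton, the usual basis for
# tracing polylines automatically. This is where the failure modes noted
# above arise: thick lines yield rugged, wandering skeletons, thin lines
# break off unduly, and dashed lines fragment because the software has no
# notion of drawing conventions.
skeleton = morphology.skeletonize(binary)
io.imsave("1550s_skeleton.tif", (skeleton * 255).astype("uint8"))
```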
Confusingly, the geoprocess akin to georeferencing raster image files is called ‘spatial adjustment’ in ArcGIS when it concerns geographically relating vector data to another dataset (here the vectorised 1550s layer to MM). Fortunately, spatial adjustment operates on very similar principles and thus proves quite intuitive to those familiar with raster georeferencing. Because spatial adjustment enables snapping exactly onto vector data nodes, much more accurate placement can be achieved (directly connecting a node within 1550s to the respective node in MM). When the internal scale of the two vector datasets is equal, the remaining error between co-located nodes (i.e. nodes in the exact same geographical location across layers) should come out at nought. Where one is certain a selected node is identical between the two vector layers, it can be fixed as a geolocated connection (hammering in a virtual nail joining both layers) called an ‘identity point’. In subsequent geoprocessing to warp the dataset, this point cannot move from its position (contrary to control points in georeferencing). This warp process is called ‘rubbersheeting’: the stretching of vector data between the identity points, guided by additional control points added locally to achieve a more precise match. Determining 42 identity points in total over an area of approx. 600x600m, encompassing the test case area, assisted by locally added control points, proved sufficient for the successful processing of the data. No residual error is produced in this process.
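The logic of rubbersheeting can be illustrated outside ArcGIS as a displacement field interpolated between matched points; a minimal sketch with numpy and scipy, using hypothetical coordinates. A vertex that coincides with a matched point receives exactly that point’s displacement, which is why identity points hold fast and no residual error arises:

```python
import numpy as np
from scipy.interpolate import griddata

# Matched point pairs: positions in the vectorised 1550s layer and their
# counterparts in MM (hypothetical local coordinates). Identity points are
# simply pairs whose correspondence is taken as certain.
src = np.array([[10.0, 20.0], [410.0, 35.0], [55.0, 560.0], [590.0, 575.0]])
dst = np.array([[12.1, 19.4], [411.6, 37.9], [53.8, 562.2], [589.1, 578.0]])
shift = dst - src

def rubbersheet(vertices):
    """Stretch vertices by linearly interpolating the displacement field
    between the matched points (only defined within their convex hull)."""
    dx = griddata(src, shift[:, 0], vertices, method="linear")
    dy = griddata(src, shift[:, 1], vertices, method="linear")
    return vertices + np.column_stack([dx, dy])

# Every polyline vertex of the 1550s layer would be passed through this.
```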
Chunchucmil map
In contrast to the Winchester plans, the Chunchucmil map results from an original archaeological topographical surface survey. This intensive process entailed pacing from the corners of a 20x20m grid system with a compass to map archaeological remains. The grid system was based on a pre-existing grid left by henequen cultivation, expanded with additional grids using theodolite measurements and connected up using high-precision GPS (Hutson pers. comm. 2012). The archaeological plan was acquired directly from Scott Hutson (courtesy of the Pakbeh Regional Economy Program) in digital format. Hutson directed and completed the mapping project at Chunchucmil, taking over from Bruce Dahlin. Frequent contact with Hutson was invaluable for preparing the GIS, and for using and interpreting the plan. Being the product of an archaeological survey, the plan contains the interpretations and professional judgments of the mappers. Mapping archaeological remains yields a representation of an empirical material situation as encountered onsite; at the same time, the exact condition of that onsite situation cannot be conveyed by the lines composing the map alone. In order to better understand why the mapped lines appear as they do, i.e. their characteristics as lines rather than what the legend tells us they convey, contact with Hutson was indispensable.
As said, for Chunchucmil’s archaeological plan I will assume the synchronicity of ca. sixth-century occupation. The map cannot serve for diachronic comparisons. The abandonment of the city left traces of a maximum phase of occupation covering a large contiguous area. No large Maya site has ever been excavated in its entirety, and investigations into earlier phases of development are typically confined to monumental architecture in the centre and individual buildings (Fash 1998). Such research indicates that monumental architectural successions often consist of superposing a new phase onto the preceding one. Andrews (1975) shows the hypothetical evolution of a ‘quadrangle group’ of buildings, in which several related but separate buildings increasingly clot together into elaborate architectural volumes. Ultimately, we know precious little about the development of cities on a settlement-wide scale. In the case of Chunchucmil, work done on the chronology of the settlement, based on limited excavations, suggests a ‘filling-in’ pattern that maximised the system of albarradas (Stanton & Hutson 2012). The finger or corridor extensions leading to outlying satellite centres of settlement appear to have been actively occupied during roughly the same period as the rest of the city (Hutson et al. 2008). Only additional archaeological research could enable efforts towards reconstructing earlier phases of the urban built environment.
The main purpose of the Chunchucmil test case is to demonstrate the compatibility and effectiveness of applying BLT Mapping to archaeological data, revealing radically different urban traditions. To account for the relative unfamiliarity with this urban tradition and the lower density of built features, a considerably larger area was selected for the test case at Chunchucmil. On Hutson’s (pers. comm. 2012) advice, a test case area was selected on the northwest side of the monumental core (see Fig. 7.4), where consistent observations during mapping raised the expectation that preservation is slightly better than in other parts of the site. The test case area covers approximately a square kilometre northwest of the site’s mapping centre, overlapping a small section of the monumental core. This represents almost a tenth of the total contiguously mapped area of the city and is intended to contain a reasonable proportion of the spatial morphological variety of the built environment.15 With an eye to up-scaling the test case to a full-blown case study in the future, the whole map was subjected to the initial data preparation.
Fig. 7.4
The approximate location and extent of the test case area indicated within the archaeological map of Chunchucmil.
Please note that in this overview map the detail of the archaeological survey has been simplified. (Base map courtesy of the Pakbeh Regional Economy Program with help from Scott Hutson.)
The mapping of Chunchucmil took place over a period of 12 years, during which many team members were involved in the work. Contrary to more recent archaeological practice, it was decided early on that the city’s plan would be drawn up in Adobe Illustrator (.ai extension). This is not software with GIS capabilities but visually oriented graphics software, albeit one working in vector format. This means that although the Chunchucmil plan concerns born-digital data, none of that data is stored geospatially. Therefore the data had to be converted to an ArcGIS proprietary format, and geospatially located and projected, before further work could commence. Unfortunately the .ai format could not be imported directly into ArcGIS.
This necessitated a laborious conversion process for legacy Adobe Illustrator data, originally set out by Wunderlich & Hatcher (2009). Their process could roughly be followed, although software updates make the steps here slightly different. Most of the process takes place in Adobe Illustrator itself, which serves to prepare the data for conversion to other formats and to avoid conflicts or corruption at that stage. Since software is constantly changing, the process is not reproduced in full. The generally important steps include the separation of all image layers, especially to separate out different kinds of digital information (e.g. lines, text, fills). To preserve the shape of automatic visual renders (e.g. curves) of drawn features, the distribution of anchor points (vertices) needs to be densified, as sketched below. This way the locations of the points giving a polyline its more precise shape can be maintained in other formats. Adobe Illustrator can then export the separate layers to AutoCAD formats.
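The densification principle can be illustrated with the shapely library (the actual step is performed inside Illustrator itself); a minimal sketch with hypothetical coordinates:

```python
from shapely import segmentize
from shapely.geometry import LineString

# A sparse polyline approximating a drawn curve.
curve = LineString([(0.0, 0.0), (10.0, 8.0), (25.0, 9.0)])

# Insert extra vertices so no segment exceeds one drawing unit; the denser
# the anchor points, the better the curve's shape survives format conversion.
dense = segmentize(curve, max_segment_length=1.0)
```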
Following Wunderlich & Hatcher (2009), a legacy AutoCAD exchange format was used (the 2000/LT2000 version of .dxf), which is assumed to store information in a simpler and more stable way than newer versions. Interestingly, no stage of the process requires operating a version of the AutoCAD software itself (although one might want to check the condition of the data). ArcGIS is then able to import .dxf files, but for unclear reasons the ArcGIS proprietary conversion tools produced grossly compromised results, beyond easy repair. Through trial and error it was found that MapInfo Professional’s Universal Translator tools produce reliable results. Here the file is converted first from .dxf into MapInfo’s proprietary .tab, and subsequently the same tool can convert from .tab to .shp (i.e. shape file), the format developed for ArcGIS and other GIS packages. These shape files, finally, can be loaded in ArcGIS without issue (text annotations remained unsuccessful throughout this conversion).
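Nowadays the .dxf-to-.shp step can also be sketched with open-source tools instead of MapInfo’s Universal Translator, for instance with geopandas reading via GDAL’s DXF driver; the file names and EPSG code below are placeholders, since .dxf stores no projection of its own:

```python
import geopandas as gpd

# Read one exported Illustrator layer from the legacy DXF exchange format.
layer = gpd.read_file("chunchucmil_lines.dxf")

# DXF carries no coordinate reference system, so one must be assigned before
# any spatial work; substitute the survey's actual projection here.
layer = layer.set_crs("EPSG:32615")

# Write to shape file for use in ArcGIS or any other GIS package.
layer.to_file("chunchucmil_lines.shp")
```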
Table 7.2 The residual errors of georeferencing the TIFFs from the Chunchucmil plan data.
Raster file (coverage) | RMS error |
---|---|
Chunchucmil’s entire plan | 0.01945 |
Block 0 | 0.00543 |
Block 1 | 0.00952 |
Block 2 | 0.00339 |
Block 3 | 0.00887 |
Block 4 | 0.00030 |
Block 5 | 0.00072 |
Block 6 | 0.00274 |
Block 7 | 0.00574 |
Block 8 | 0.00605 |
Block 9 | 0.00329 |
In addition to converting the .ai data to sufficiently reliable .shp format, the .ai map was converted to PDF, and in turn converted to TIFF in Adobe Photoshop. Since the originally shared map only showed a partial grid around the site’s centre, the same was done for the PDFs (provided at a later date) containing the 10 gridded and labelled blocks in which the site plan was organised. These blocks provide coded references for mapped features to improve navigation and referencing across the city plan.16 Adding a raster image layer of the whole city plan as a dataset in ArcMap enables essential visual checks on the integrity of the converted vector data, and shows the annotations (labels) that did not convert well in the earlier process, aiding interpretive work. After assigning the correct projection to the imported raster data, using the coordinates of the site’s centre point, the TIFF containing the entire plan could be georeferenced.17 Knowing that the partial grid across the centre consists of 250x250m blocks, the georeferencing could be scaled (using five points on the grid in quincunx fashion). This could subsequently be extended to include the grids of the 10 label blocks by using the four extreme corners of each grid. The results of this georeferencing process can be found in Table 7.2.
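The scaling rests on a least-squares fit between pixel positions and the known grid coordinates; a sketch of that arithmetic with numpy, using hypothetical pixel values for the five quincunx points on the 250x250m block grid:

```python
import numpy as np

# Pixel positions of the five quincunx points in the TIFF (hypothetical) and
# their known coordinates on the 250x250m centre grid (local metres).
px = np.array([[120.0, 110.0], [2120.0, 112.0], [118.0, 2114.0],
               [2122.0, 2116.0], [1120.0, 1113.0]])
grid = np.array([[0.0, 500.0], [500.0, 500.0], [0.0, 0.0],
                 [500.0, 0.0], [250.0, 250.0]])

# Solve the six-parameter affine transform [x, y, 1] @ A = [X, Y] by least
# squares; A encodes scale, rotation, and translation in one step.
design = np.column_stack([px, np.ones(len(px))])
A, *_ = np.linalg.lstsq(design, grid, rcond=None)

# Residuals at the control points give the RMS error reported in Table 7.2.
res = design @ A - grid
rms = np.sqrt(np.mean(np.sum(res ** 2, axis=1)))
```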
Next, the generated shape files containing the vector data of the original plan were imported as layers in the GIS. Using the spatial adjustment tools as described for Winchester, and referring to the four extreme corners of the partial centre grid (previously included as part of each separate .ai vector layer), each layer could be displaced and scaled exactly (i.e. literally without processing error) onto the corresponding coordinates. With the vector layers overlaying the raster images, the quality and integrity of the vector data conversions could be checked in every detail as well as for overall completeness. On inspection, only a few minute details seemed to be missing (likely due to visual rendering techniques). As relatively easy manual edits could resolve any remaining issues in the subsequent processes preparing data parity (below), the data were deemed fit for use.
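Because the layers only need displacing and scaling, the operation amounts to an exact affine transform on the stored coordinates, which is why no processing error accrues; a minimal sketch with geopandas, using hypothetical factors and offsets:

```python
import geopandas as gpd

# One converted vector layer from the original plan (hypothetical file name).
layer = gpd.read_file("block_0_lines.shp")

# Scale the layer's internal drawing units to metres, then shift its origin
# onto the georeferenced corner of the partial centre grid; both operations
# are exact, so no residual error is introduced.
layer.geometry = layer.geometry.scale(xfact=0.25, yfact=0.25, origin=(0, 0))
layer.geometry = layer.geometry.translate(xoff=821500.0, yoff=2279000.0)

layer.to_file("block_0_lines_geo.shp")
```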