R is a language and environment for statistical computing and graphics, freely distributed under the terms of the GNU General Public License [?]. It is similar to the S language, which was developed at AT&T Bell Laboratories, although the two differ in important design aspects.
R provides a wide variety of statistical and graphical techniques and is highly extensible, with interfaces to procedures written in C/C++ or FORTRAN. Further information can be found at http://www.r-project.org.
TerraLib is a Geographic Information System (GIS) library written in C++ and developed by the Instituto Nacional de Pesquisas Espaciais (INPE). It is available on the Internet as open source, which supports collaborative development and its use as a basis for multiple GIS tools [?]. It defines a geographical data model and provides support for this model over a range of different Database Management Systems (DBMS). Further information can be found at http://www.terralib.org.
TerraView is an example of an application built on the TerraLib class library. It is a GIS application with spatial analysis capabilities, and is also licensed as free software under the GNU General Public License. It can be downloaded together with TerraLib.
aRT (API R-TerraLib) is a package that integrates R and TerraLib. It is still a prototype, with a few basic operations that demonstrate that the integration is possible. The idea is to have a package that combines the statistical analysis provided by R with the geographical data model and database support provided by TerraLib. Further information can be found at http://www.est.ufpr.br/aRT.
The main motivation for developing the package is to facilitate the exchange of information between the spatial packages in R (see http://sal.agecon.uiuc.edu/csiss/Rgeo/) and TerraLib, which can manage spatial data and perform some spatial operations on the database. With aRT, data can be moved easily between R and TerraLib. A data analyst could, for instance, import data into R, perform an analysis using a spatial package such as spdep, splancs, gstat or geoR, and return the results to the database. Those results could then be accessed by GIS software such as TerraView.
Section 2 lists the aRT requirements and dependencies. To start the development, we defined six basic operations, which are described in Section 3 along with examples of the capabilities of aRT.
aRT is being developed on a Debian GNU/Linux platform and is not guaranteed to work on any other. The prototype does not yet have an autoconfigure script, so the configuration must be done manually. The following software and libraries are necessary:
The package ships with a script that downloads and installs MySQL, Qt and TerraLib under Debian GNU/Linux.
Assuming MySQL, Qt, TerraLib and R are installed under /usr/local, the following environment variables should be set, e.g., in the .bash_profile or .bashrc file in the user’s home directory. Change the directories to match your installation paths.
# Default directories (TerraLib, libmysqlclient.a and libR.so),
# used to __Make__ aRT:
TERRALIBDIR=/usr/local/terralib
LIBMYSQLCLIENTDIR=/usr/lib
LIBRDIR=/usr/local/lib/R/lib

# TerraLib shared libraries, used to __execute__ aRT:
LD_LIBRARY_PATH=$TERRALIBDIR/terralibx/terralib:\
$TERRALIBDIR/terralibx/tiff:\
$TERRALIBDIR/terralibx/shapelib:\
$TERRALIBDIR/terralibx/stat

export TERRALIBDIR LIBMYSQLCLIENTDIR LD_LIBRARY_PATH LIBRDIR
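A variable that was never exported usually shows up only later, as a build or load failure, so it can help to check from within R that the settings are visible. The following is a minimal sketch in base R. The helper name missing_vars is ours and is not part of aRT, and the list of variable names simply mirrors the configuration above.

```r
# Variable names taken from the configuration block; adjust to your setup.
required <- c("TERRALIBDIR", "LIBMYSQLCLIENTDIR", "LIBRDIR", "LD_LIBRARY_PATH")

# Hypothetical helper (not part of aRT): returns the names of the
# variables that are unset or empty in the current R session.
missing_vars <- function(vars) {
  values <- Sys.getenv(vars, unset = NA)
  vars[is.na(values) | values == ""]
}

not_set <- missing_vars(required)
if (length(not_set) > 0) {
  message("Not set in this session: ", paste(not_set, collapse = ", "))
} else {
  message("All aRT-related environment variables are set.")
}
```

The check only sees variables that were exported before R was started, so it should be run in a fresh session after editing the shell profile.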
After installing aRT and starting an R session, load the package with the command source. If the package is loaded successfully, a message will be displayed.
aRT has four classes to manipulate TerraLib data and functions: aRTconn, aRTdb, aRTlayer and aRTtheme. The next subsections explain each class in detail.
Once the package is loaded, we need a DBMS connection. This is done by creating an aRTconn object. The constructor of the aRTconn class takes the arguments user, password, host and port, whose default values are the USER environment variable, the empty string, “localhost” and 3306, respectively. For example:
After the object con is created, the values it contains cannot be changed. If you need different values, the only way is to create another object. This is because the data is stored in an external pointer; the details are beyond the scope of this text.
An aRTconn object stores a virtual connection: every time database access is required, it connects, performs the operation, and then disconnects. The purpose of this class is to perform some database administration functions and to open real connections. For example, if this is the first time you are running aRT, you may need to grant permissions to some users. To do this, use addUser():
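The "virtual connection" behaviour described above (connect, do the work, disconnect, on every call) can be illustrated without a database. The sketch below uses only base R and does not show how aRT is implemented. The names with_session and record are hypothetical, and the "connection" is just a log of events.

```r
# Stand-in for a DBMS: we only record what would happen.
events <- new.env()
events$log <- character(0)
record <- function(what) events$log <- c(events$log, what)

# Hypothetical wrapper: opens a session, runs one action, and always
# closes the session afterwards, even if the action fails.
with_session <- function(action) {
  record("connect")
  on.exit(record("disconnect"), add = TRUE)
  action()
}

with_session(function() record("list databases"))
with_session(function() record("add user"))

print(events$log)
```

Each call is bracketed by its own connect and disconnect, so no session stays open between operations. This matches the idea that an aRTconn object holds the connection parameters rather than a live connection.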
Warning: this function gives ALL permissions to a user. If you want something different, you need to run mysql yourself and use the GRANT command.
With an aRTconn object, you can also see the databases available and remove them. The next example shows the databases and tries to remove the database parana if it exists:
To create a new database, or to access an existing one, there is the aRTdb class. Objects of this class store a real database connection and need an aRTconn object to be created:
This constructor tries to load a database named parana. If it does not exist (which is the case here, since we removed it), the constructor checks the create argument and tries to create a new database. Once this object is created, it no longer depends on the con object.
aRTdb objects hold in memory all the TerraLib objects needed by aRT. This means that all objects opened from an aRTdb object depend on it, even after they are created in R. The last line of the printed output shows the number of children the object has. If the object is removed from R, all its children become invalid once R’s garbage collector removes it from memory.
To work with data in aRT, we need to manipulate layers. A layer can store geometries of one kind (points, polygons or raster for now; lines and cells in the future) together with attributes. Layers are TerraLib abstractions that use data tables and control tables in a database, so they are created from aRTdb objects.
The constructor has an argument proj that states which projection the layer data is in. The default value is plan, meaning that the data can be drawn as it is. The other option (so far) is geographic, meaning that the data is in degrees and must be converted before it is plotted.
To insert data into the layer, we will use the bodmin dataset, part of the splancs package.
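For readers unfamiliar with the dataset, the following sketch shows only the splancs side: loading bodmin and extracting the point coordinates and the boundary polygon. It assumes the splancs package is installed and does not use aRT at all. The variable names pts and poly are ours.

```r
library(splancs)   # provides the bodmin dataset and as.points()

data(bodmin)
str(bodmin)                 # a list holding coordinates and a boundary polygon

pts  <- as.points(bodmin)   # two-column matrix of x/y coordinates
poly <- bodmin$poly         # boundary polygon of the study region

dim(pts)
head(pts)
```

With these two objects at hand, polymap(poly) followed by pointmap(pts, add = TRUE) draws the region and overlays the points using the plotting functions from splancs.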
Before inserting the data into the database, we must convert it to the aRT format. aRT has functions to convert data from other spatial packages (currently splancs and geoR). These functions are named <pkg>2aRT<datatype>, where pkg can be sp or gr, and datatype can be one of points, polygons or attributes. As an example, the next code converts the bodmin data from splancs to aRT and inserts it into the database:
To insert the enclosing polygon, we will create another layer:
Finally, we will perform a kernel analysis and insert the resulting raster data into the database, in another layer:
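The kernel analysis itself can be carried out in R before anything is written to the database. The sketch below uses kernel2d from splancs on the bodmin data. The bandwidth h0 = 2 and the 50 x 50 grid are illustrative choices of ours, not values taken from this text, and the step that stores the result in an aRT layer is deliberately left out.

```r
library(splancs)

data(bodmin)
pts  <- as.points(bodmin)   # point locations
poly <- bodmin$poly         # region over which the surface is estimated

# Kernel smoothing of the point pattern on a regular grid.
# h0 (bandwidth) and nx/ny (grid size) are illustrative values.
dens <- kernel2d(pts, poly, h0 = 2, nx = 50, ny = 50)

names(dens)      # grid coordinates, the matrix of estimates, and settings
dim(dens$z)      # one estimate per grid cell
```

The result has components x, y and z, so it can be passed directly to image(dens) for a quick look. Its grid structure is what makes it suitable for storage as raster data.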
There are now three layers, all children of db, as can be seen in the next code:
All add functions receive an argument close = TRUE, which indicates whether this is the last time data will be inserted in the layer. Once the layer is closed, no geometry can be added to it. You can implicitly close the layer and create the table by calling createEmptyTable(). After the layer is closed, attributes can be inserted.
To get the layer’s geometry, call getGeometry; the result can then be plotted. If you do not need the data itself, the layer can be plotted directly:
Several layers can be drawn on the same plot using add = TRUE.
The last class implemented in aRT is aRTtheme. Themes can be visualized in TerraView and are, so far, of no use to non-TerraView users.
Now we will create themes for points and polygons, and put them in the view named view:
Raster themes accept an additional argument: the color configuration. It can be used as in the next example.
aRT objects can be removed in two ways: from memory and from the database. aRTdb objects hold all the memory used by aRT objects; aRTlayer objects only have pointers into their aRTdb. To remove data from memory, just call rm and, for aRTdb objects, call gc explicitly if you want to free the memory. Note that, since an aRTlayer object needs its aRTdb, if the garbage collector removes an aRTdb object, all the aRTlayer objects opened from it become invalid.
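The difference between calling rm and actually releasing memory can be seen with plain R objects. The sketch below uses only base R: an environment with a finalizer stands in for an aRTdb object (an assumption made for illustration; it is not aRT code), and the finalizer plays the role of the cleanup that frees the underlying resources.

```r
state <- new.env()
state$released <- FALSE

# Stand-in for an aRTdb object: the finalizer marks when its
# resources would actually be released.
parent <- new.env()
reg.finalizer(parent, function(e) state$released <- TRUE)

rm(parent)        # removes the name; the object may still be in memory
invisible(gc())   # explicit collection; the finalizer runs here

print(state$released)
```

Calling rm alone only makes the object unreachable. The cleanup happens when the garbage collector runs, which is why an explicit gc call is suggested when the memory should be released at a known point.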