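Classification of Land Use and Occupancy with Emphasis on Urban Areas
Classificação do Uso e Ocupação do Solo com Ênfase em Áreas Urbanas
Roberta Plangg Riegel; Darlan Daniel Alves; Leonardo Espindola Birlem; Douglas Cristian Roque; Guilherme Garcia de Oliveira; Claus Haetinger; Daniela Montanari Migliavacca Osório; Marco Antônio Siqueira Rodrigues & Daniela Muller de Quevedo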

Urbanization in Brazil occurred in a disorganized way, without criteria that could ensure the sustainability of the process, resulting in high population concentrations in large urban centers. This highlights the need for adequate planning that addresses economic and social interests and respects environmental demands. The present work therefore aims to map land use and occupancy, with an emphasis on urban areas, taking the municipality of Novo Hamburgo, RS, as a case study. The methodology is organized in two levels of class detail, in which supervised classification and manual vectorization were used, seeking a more accurate specification of the urban areas. Validation was carried out with two databases, Google Earth and a Remotely Piloted Aircraft System (RPAS), using the confusion matrix and the Kappa index. The results show the efficiency of the hybrid method for high-resolution images and highlight the anthropogenic differences that exist within urban areas. Regarding the use of different databases for validation, the two processes yielded close values, with a Kappa index of 0.928 for data obtained from Google Earth and 0.943 for the RPAS.


Introduction
Urbanization in Brazil occurred in a disorganized manner, resulting in high population concentrations in large urban centers and medium-sized municipalities, which contributed to the formation of the country's metropolitan regions from the 1970s onward (Brito et al., 2001). This pattern of urbanization consolidated two main characteristics in the major cities: the lack of sustainability of the urban expansion process and the low quality of life (Grostein, 2001). It was accompanied by a considerable increase in the lack of basic infrastructure services, which became a characteristic of the historical formation of Brazilian cities (Machado, 2000).
The relation between the environment and human activities has always been a point of discussion among public agencies and institutions (IBGE, 2013). It is therefore important to recognize the need for adequate planning that addresses economic interests and respects environmental demands while also diagnosing the current situation of the territory and identifying possible future problems, so that these results can be used to build new planning and management guidelines (Nascimento et al., 2009; Kibret et al., 2016). Remote sensing can be an efficient tool in this process because of its ability to collect information from an object or area of interest without necessarily having physical contact with the environment (Florenzano, 2011).
In the past two decades, different remote sensing bases became available, especially with the advent of orbital sensors of high temporal resolution, which made it possible to expand the scale of analysis (level of detail) in urban areas and to improve the techniques for mapping land use and occupancy. However, most of this information still comes from low and medium resolution images, which support adequate analyses at global and regional levels but are inefficient for local planning (Lu & Went, 2007; Rozon et al., 2015; Almeida et al., 2016).
Changes in land use and occupancy in urban areas are among the most serious environmental problems, due to the high intensity of these processes in metropolitan areas. Monitoring these land changes provides resources to support decision making in environmental urban planning and, consequently, in the management of natural resources.
Therefore, the use of high-resolution images is encouraged in order to obtain detailed information about urban structures and thus facilitate the assignment of socioeconomic functions to different classes (Hu et al., 2016). In this context, the objective of this work is to map land use and occupancy, with emphasis on the urban area, using high-resolution satellite images, and to compare two databases for the validation of the classification. The mapping distinguishes different urban meshes, resulting in a basic tool for planning, and takes the Municipality of Novo Hamburgo as a case study.

Materials and Methods

Characterization of the Study Area
The municipality of Novo Hamburgo is located in the metropolitan portion of the state of Rio Grande do Sul, at coordinates 29°67' South and 51°13' West, and is part of the Sinos Valley Region, an important economic and industrial region of the state. Located 40 km from the capital, Porto Alegre, it has approximately 238,940 inhabitants distributed over a territorial area of 224 km² (IBGE, 2010) (Figure 1). Novo Hamburgo was chosen as a case study because it is a medium-sized city, located in a metropolitan region, that has undergone an accelerated process of urbanization. This fact, combined with the footwear industry crisis that hit the municipality in the 1990s, resulted in the formation of irregular occupations around the main urban area, as well as recurrent infrastructure problems, highlighting the need for planning strategies and guidelines.

Classification Method
For the mapping of the land use and occupancy of Novo Hamburgo, 4 images of the Pleiades satellite were used, acquired on April 25, 2015, with a spatial resolution of 0.50 m (panchromatic) and [...]. The first step in the classification process consisted of georeferencing, using the Universal Transverse Mercator (UTM) coordinate system and the WGS84 datum. After that, the images were pre-processed through orthorectification and the fusion of the multispectral bands with the panchromatic band, followed by cropping to the study area. Then the hybrid classification method was used, which employs automated and manual techniques. Layers were initially created to digitize the classes, which were first obtained with the supervised classification method, a pixel-based classification that relies on maximum likelihood. The choice of a pixel-by-pixel method is linked to the use of a hybrid classification system: the spectral confusion between some classes, as well as the granular appearance of the result, were minimized by the manual process.
The manual method consists of digitizing the classes from visual interpretation. This process takes a long time but allows greater precision in the mapping (Zhang et al., 2014; Kibret et al., 2016). The extent of study areas and the advancement of technologies make it unfeasible to use this technique alone, since research tends to favor methodologies that allow replication, unlike visual interpretation, which consists of specific analyses that change with each case (Zhang et al., 2014). El-Kawy et al. (2011) also emphasize that the combined use of supervised classification techniques and visual interpretation increases overall accuracy by 10%.
To recognize the classes, elements of interpretation such as color, tonality, shape and texture were used. For the specific detailing of the urban areas, the size of the building typologies and of the lots, the proximity between them, and the shadow cast by tall buildings were analyzed.
The thematic classes of the mapping were divided into two levels, drawn from the studies carried out by Anderson et al. (1976), Almeida Filho & Almeida (2001), IBGE (2013) and Batista et al. (2013). Level I establishes a simplified categorization that encompasses the main categories of land cover, based on the specific analysis of remote sensing data (IBGE, 2013). Level II provides more detailed information for the mapping and requires high-resolution images (Table 1).
The Level I map was built from the supervised classification, in which Rural Areas, Vegetation Areas, Urban Areas and Hydric Resources were identified. Due to the high resolution of the Pleiades images, not every pixel of the urban areas and the hydric resources was classified correctly, because of the brightness values (digital numbers) of the pixels. The classification was therefore assisted by the manual process, through the vectorization of these two classes. The second procedure was the detailing of these areas in order to establish the Level II classes, which were produced exclusively by manual classification.
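The pixel-based maximum likelihood step described above can be illustrated with a minimal sketch. It is not the software routine used in the study: it assumes each class is modeled by a Gaussian with independent bands (a diagonal covariance, chosen only to keep the example short), and the class names, band order and training values are hypothetical.

```python
import math

def fit_gaussian(samples):
    """Per-band mean and variance of the training pixels of one class."""
    n = len(samples)
    bands = len(samples[0])
    means = [sum(s[b] for s in samples) / n for b in range(bands)]
    # A small floor keeps the variance strictly positive for uniform samples.
    variances = [
        max(sum((s[b] - means[b]) ** 2 for s in samples) / n, 1e-6)
        for b in range(bands)
    ]
    return means, variances

def log_likelihood(pixel, model):
    """Gaussian log-likelihood of a pixel, assuming independent bands."""
    means, variances = model
    return sum(
        -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        for x, m, v in zip(pixel, means, variances)
    )

def classify(pixel, models):
    """Assign the class whose model gives the pixel the highest likelihood."""
    return max(models, key=lambda name: log_likelihood(pixel, models[name]))

# Hypothetical training pixels: (blue, green, red, near-infrared) digital numbers.
training = {
    "vegetation": [(30, 60, 40, 200), (35, 65, 45, 210), (28, 58, 38, 190)],
    "urban": [(120, 118, 125, 110), (130, 126, 131, 115), (115, 112, 120, 105)],
}
models = {name: fit_gaussian(pixels) for name, pixels in training.items()}
label = classify((32, 61, 42, 205), models)
```

Because every pixel is labeled independently of its neighbors, isolated misclassified pixels produce the granular appearance that the manual vectorization is meant to correct.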

Validation Method
To validate the reliability of the results, aerial images from a Remotely Piloted Aircraft System (RPAS) were used. Because the photos obtained with the RPAS cover a limited area, the validation process was complemented with data extracted from Google Earth, in order to establish the accuracy of the classification maps.
The RPAS images were obtained with the SPYDER SD6 XL model at an altitude of 100 meters, with a resolution of approximately 2.5 cm. In this way, 6 aerial images from 2015 were analyzed, from which 21 control points were extracted and identified. The methodology proposed by Kibret et al. (2016), which places a window of 3x3 pixels on each control point, was used to reduce the error between the observed data and the ground truth. In total, 189 pixels were analyzed.
The validation method consisted of building an error matrix in which the columns represent the collected data (real data) and the rows represent the classification produced by photointerpretation. After the cross-tabulation of the two data sets, the main diagonal indicates the level of agreement between them. To build this matrix, the "Spatial Join" command, which joins spatial information, was used, followed by the "Frequency" and "Pivot Table" tools.
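The pixel count follows directly from the window size: 21 control points times 9 pixels per window gives 189 pixels. A minimal sketch of the window extraction, in which the control point positions are hypothetical placeholders rather than the surveyed points:

```python
def window_3x3(row, col):
    """Return the 9 pixel coordinates of a 3x3 window centered on (row, col)."""
    return [(row + dr, col + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]

# Hypothetical control point positions, spaced so that windows do not overlap.
control_points = [(10 * i + 5, 10 * i + 5) for i in range(21)]

# All pixels evaluated in the validation: 21 windows of 9 pixels each.
pixels = [cell for point in control_points for cell in window_3x3(*point)]
total = len(pixels)
```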
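The cross-tabulation can be sketched outside the GIS environment as follows. This is only an illustration of the matrix layout described in the text (rows for the classification, columns for the reference data), not a reproduction of the "Spatial Join", "Frequency" and "Pivot Table" workflow; the class names and labels are hypothetical.

```python
from collections import Counter

def confusion_matrix(classified, reference, classes):
    """Rows follow the classified labels, columns follow the reference labels."""
    counts = Counter(zip(classified, reference))
    return [[counts[(row, col)] for col in classes] for row in classes]

# Hypothetical labels for six validation pixels.
classes = ["urban", "rural", "vegetation"]
classified = ["urban", "urban", "rural", "vegetation", "vegetation", "rural"]
reference = ["urban", "rural", "rural", "vegetation", "vegetation", "rural"]

matrix = confusion_matrix(classified, reference, classes)
# The main diagonal counts the pixels on which both sources agree.
agreement = sum(matrix[i][i] for i in range(len(classes)))
```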
The statistical validation of the results was performed in the Excel® software based on the confusion matrix, using the Kappa index (Cohen, 1960), which rates the accuracy of the land use mapping by relating the correctly classified pixels to the agreement expected by chance (Equation 1).

$$\kappa = \frac{n \sum_{i=1}^{k} n_{ii} - \sum_{i=1}^{k} n_{i+}\, n_{+i}}{n^{2} - \sum_{i=1}^{k} n_{i+}\, n_{+i}} \qquad (1)$$

where $n_{ii}$ is the value in row $i$ and column $i$; $n_{i+}$ is the sum of row $i$ and $n_{+i}$ is the sum of column $i$ of the confusion matrix; $n$ is the total number of samples and $k$ is the total number of classes.
The quality of the classification associated with the Kappa statistic was assessed using the accuracy levels proposed by Landis and Koch (1977), which classify values between 0.80 and 1.00 as excellent.
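As a worked illustration of Equation 1, the sketch below computes Cohen's Kappa from a confusion matrix with the quantities defined in the text (diagonal values, row sums, column sums, total number of samples). The example matrix is hypothetical and is not the matrix obtained in the study.

```python
def kappa(matrix):
    """Cohen's Kappa from a square confusion matrix given as a list of rows."""
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    diagonal = sum(matrix[i][i] for i in range(k))
    row_sums = [sum(matrix[i]) for i in range(k)]
    col_sums = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    # Sum over classes of (row total x column total): the chance agreement term.
    chance = sum(r * c for r, c in zip(row_sums, col_sums))
    return (n * diagonal - chance) / (n * n - chance)

# Hypothetical 3-class matrix with 6 samples, 5 of them on the diagonal.
example = [[1, 1, 0], [0, 2, 0], [0, 0, 2]]
value = kappa(example)  # (6*5 - 12) / (36 - 12) = 0.75
```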
In parallel with the validation through the RPAS, an analysis was made with points extracted from Google Earth. In recent years, Google Earth images have become an important instrument for validating current mappings, given the availability of data with adequate resolution. However, there is still no consensus on the number of points or on how they should be defined, as observed in the works of Kibret et al. (2016) and Hegazy & Kaloop (2015). For the Google Earth images, the number of samples to be validated was therefore determined through the central limit theorem, assuming a normal distribution. First, the number of pixels in the Pleiades image was identified: 887,967,571 pixels, which constitute the sample universe. From Equation 2, the number of samples required was calculated with a confidence level of 95%, which resulted in the validation of 385 sample points.
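Equation 2 is not reproduced in the text, so the exact expression used is not known. As a hedged illustration, the common sample size formula for a proportion with finite population correction is sketched below. The 5% margin of error and the proportion of 0.5 are assumptions, not values stated in the text; under these assumptions the formula yields the 385 points reported.

```python
import math
from statistics import NormalDist

def sample_size(population, confidence=0.95, margin=0.05, proportion=0.5):
    """Sample size for estimating a proportion, with finite population correction.

    margin and proportion are assumed values; the text only states the
    95% confidence level and the population size.
    """
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z ** 2 * proportion * (1 - proportion) / margin ** 2
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)

points_needed = sample_size(887_967_571)
```

With a population this large, the finite population correction has practically no effect, and the result is governed by the confidence level and the assumed margin of error.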
With the number of samples defined, the validation points were generated with the "Create Random Points" tool of the ArcGIS® software, which creates random points within a polygon. The municipality of Novo Hamburgo was used as the boundary for the distribution of the 385 points, which were exported in KMZ format so that they could be viewed in Google Earth over images from the same period (2015). The class of each point was assigned by a second observer, provided with the information described in Table 1, who recorded the classes in an Excel® worksheet.
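The idea of distributing random points inside a boundary can be sketched with rejection sampling: draw points in the bounding box and keep those that fall inside the polygon. This is an illustrative stand-in, not the algorithm of the ArcGIS® tool; the boundary coordinates and the fixed seed are hypothetical.

```python
import random

def point_in_polygon(x, y, polygon):
    """Ray casting test: count edge crossings of a ray going toward +x."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def random_points(polygon, count, seed=0):
    """Draw points uniformly in the bounding box, keeping those inside."""
    rng = random.Random(seed)
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    points = []
    while len(points) < count:
        x = rng.uniform(min(xs), max(xs))
        y = rng.uniform(min(ys), max(ys))
        if point_in_polygon(x, y, polygon):
            points.append((x, y))
    return points

# Hypothetical L-shaped boundary standing in for the municipal limits.
boundary = [(0, 0), (10, 0), (10, 6), (4, 6), (4, 10), (0, 10)]
points = random_points(boundary, 385)
```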
In the ArcGIS® software, the information contained in the spreadsheet was transferred to the attribute table of the points. Based on this table, a confusion matrix was constructed, which was again evaluated with the Kappa index, according to the method explained above.
Based on the two validations, a comparative analysis was carried out to assess the reliability of the information and the efficiency of the two databases.

Analysis of Land Use and Occupancy
Figure 2 presents the Level I mapping, obtained through the hybrid method, in which the vegetation and rural areas were identified by the automatic process (supervised classification) and the remainder was defined manually in order to reach the required levels of accuracy. This map simplifies the land use classification so that it can be compared with classifications of low-resolution images. The figure shows the urban occupancy concentrated in the northern part of the municipality, the immense rural area that intertwines with vegetation in the southern portion, and the areas covered by water resources, which include rivers, ponds and reservoirs and form a transition environment between the consolidated area and the rural area.
The detail of the area (Figure 3), especially of the urban area of the municipality, shows the interactions of the urban mesh, most notably the large area under consolidation, which is characterized by medium and high density. The vast fragmentation of the mesh into different classes is also visible, demonstrating the need for differentiated treatment in spatial planning, since soil permeability and rainwater flow capacity are directly associated with this issue. The high concentration of density also establishes the need for appropriate infrastructure, especially for water distribution, waste collection and sewerage. The imbalance between environment and urban occupation is expressed in this image, since the appropriation of the land developed rapidly and without specific controls on use.

The wetlands are also relevant in the mapping, mainly from an environmental point of view. Occupancy in these areas causes the loss of services that are highly beneficial to society: groundwater recharge, the maintenance of biodiversity, the regulation of the local climate, the storage and cleaning of water, the regulation of biogeochemical cycles, carbon storage and habitat for a large number of species (Junk et al., 2011). The rural and vegetation areas are predominant in the southern portion, which also underscores the need for care in directing the urban expansion of the municipality.
Table 2 presents the area of each class and shows the increase of urban areas from 21% in 2009 (Riegel & Quevedo, 2015) to 23.9% in 2015. Among the specific urban classes, the areas under consolidation stand out with 12.79% of the territory, followed by subdivided areas (3.77%), consolidated areas (2.97%) and parcels with vegetation (2.97%); the remaining smaller classes totaled 1.53%. The rural areas had the highest percentage of the territory, totaling 39.67%. This result is due to the vast areas of field and transition (31.65%) that lie between the urban territory and the areas of vegetation. The vegetation areas also showed a high value, 30.40%; however, it is not possible to state that these environments are composed only of native vegetation, since this variable was not analyzed. Finally, the areas composed of water resources comprise 6.03% of the territory, most of which corresponds to the wetlands that are significant in the region, with 5.02%. Despite the anthropogenic pressure associated with the metropolitan region, there is an immense area still not consolidated, composed of important ecosystems that help to balance the environment. Recognizing the need for special care with urban areas, and especially with areas threatened by expansion, is fundamental in a metropolitan scenario and supports the use of efficient strategies for territorial planning.

Validation Analysis
The confusion matrix obtained from the 189 points evaluated with the RPAS images presented a Kappa index of 0.943, classified as excellent according to Landis & Koch (1977). The classification errors are directly associated with the confusion between the classes forest formation (16) and reforestation (15).
The confusion matrix generated by validating the Level II land use and occupancy map with random points on Google Earth was based on 385 points and resulted in a Kappa index of 0.928, also classified as excellent according to Landis & Koch (1977). The matrix revealed classification errors at the intersection of the following classes: consolidated urban area (1) with urban area under consolidation (2); area in consolidation (2) with industrial area (5); urban area with vegetation with forest formation; exposed soil (12) with field area (13); field area (13) with cultivation area (14); field area (13) with reforestation (15); field area (13) with forest formation (16); field area (13) with wet area (17); cultivation area (14) with exposed soil (12); reforestation area (15) with forest formation (16); and forest formation (16) with wetland (17). The main errors on the classified map, however, were in the field areas, in the urban areas in consolidation and in the reforestation areas. The mapping validated by this method was considered efficient, given the high value obtained for the Kappa index, and is satisfactory for use.
The two validation databases produced similar Kappa indexes, and both were satisfactory for this purpose. The use of Google Earth required a lengthy review process, with the manual labeling of 385 points, which makes it unfeasible for larger areas at the same resolution. The aerial images obtained in the field, despite the smaller number of points, also gave appropriate results, but they demand additional financial resources for the RPAS and the fieldwork.

Final Considerations
The study shows that automated methods combined with manual vectorization are adequate for the classification of high-resolution images, since the validation coefficients were considered excellent. The results also revealed relevant environmental features, such as wetlands, cropping areas and vegetation environments, which are important for consolidating strict guidelines for territorial occupancy. The urban mesh, formu-