A statistical assessment of maritime socioeconomic indicators for the European Atlantic area

Ever since the introduction into marine and maritime policy strategies worldwide of the relatively new concept of Blue Growth there has been an increasing interest in developing integrated systems of indicators for the maritime economy. The Marnet project has been a recent attempt to develop a comparative maritime socioeconomic framework for the European Atlantic area and its database contains a large number of socioeconomic indicators for many maritime activities at different territorial levels that provide the information needed to help analyze and compare the maritime economy of the European Atlantic regions. However, there are still many gaps with respect to the spatial and sectoral coverage of the statistical information available. This paper aims to assess the statistical coverage of the main maritime economic sectors in order to contribute to filling these gaps. To help determine where future statistical efforts should focus the paper gives a list of indicators classified by maritime sectors and activities with information on the degree of territorial coverage of each indicator as measured by the percentage of EU Atlantic regions with data at each territorial level. Based on this information, a list of failed indicators is presented in terms of EU Atlantic countries with no data plus the percentage of Atlantic regions with missing data in the rest of EU countries. Also, a Data Envelopment Analysis based statistical method is proposed to evaluate and compare the relative importance of each maritime sector on the European Atlantic economy. Finally, variation among both sectoral and regional DEA scores is also discussed with the help of a combination of distribution and box-and-whisker plots, as it may offer novel insights into the influence of the maritime economy on the European Atlantic area. This research article is available in Journal of Ocean and Coastal Economics: https://cbe.miis.edu/joce/vol2/iss2/4

INTRODUCTION
There has been in recent times an increasing interest in the analysis of the importance of the oceans and their resources for human development and economic growth. In this sense, the introduction into European Union marine and maritime policy strategies of the Blue Growth concept identifies the maritime economic activities as crucial drivers for growth and jobs for the EU economy (COM, 2014). Thus, according to EU maritime affairs policies, Blue Growth is the long term strategy to support sustainable growth in the marine and maritime sectors as a whole. In this sense, seas and oceans are considered drivers for the European economy with great potential for innovation and growth that can help to achieve the goals of the Europe 2020 strategy for smart, sustainable and inclusive growth (European Commission, 2015).
Evaluating the maritime economy by monitoring its socioeconomic sectors, in order to support policy making, requires empirical data. This has led in recent years to studies that attempt to quantify the weight of the maritime economy in different countries (Kildow and McIlgorm, 2010; Foley et al., 2014; Park and Kildow, 2014). However, official statistics are not specifically designed to measure the economic contribution of the oceans. Consequently, the results obtained are not necessarily comparable, because the economic activities, classification systems, data collection methods, time periods and territorial levels taken to constitute the maritime economy differ from country to country (Kalaydjian, 2009; Kildow and McIlgorm, 2010; Surís-Regueiro et al., 2013; Park and Kildow, 2014; Zhao et al., 2014).
In this context, the main objective of the Marine Atlantic Regions Network project (Marnet project, 2014) was to develop a coherent framework for a maritime socioeconomic database with a robust methodology for the collection of comparable data on maritime activities in the European Atlantic area. This common database aims to solve most of the aforementioned problems of data homogeneity between countries, thus allowing supranational analyses of the maritime economy not only on a nation-by-nation basis but also at a more detailed regional level. It can thus provide the statistical foundation for practical applications such as assessing the position of national maritime clusters in the wider context of the European Atlantic maritime economy (Fernández-Macho et al., 2015) or constructing a synthetic index to measure and compare the economic importance of the maritime sector in the European Atlantic regions (Fernández-Macho et al., 2016). However, there are still many gaps with respect to the spatial and sectoral coverage of the statistical information available.
The main purpose of the present paper is to assess the statistical coverage of the main maritime sectors (living and non-living resources, ship/boat building and maritime related construction, transportation, tourism, public administration, education and R&D) in order to help identify the activities where these data gaps most need to be filled. To this end, a list of failed indicators is presented in terms of EU Atlantic countries with no data and the percentage of EU Atlantic regions with missing data in the rest of countries. To complete this, the paper also gives a list of indicators classified by maritime sectors and activities with information on the degree of territorial coverage of each indicator as measured by the percentage of EU Atlantic regions with data at each NUTS level.
Based on the latter, the paper evaluates and compares the level of importance of each maritime economic sector through the usual descriptive statistical measures. To perform this analysis on a set of heterogeneous indicators (such as value added, turnover, persons employed, enterprises, passengers, landing tonnage, energy transmission, pipe length, etc.) with different units of measurement, an appropriate tool to aggregate and homogenize all available information is needed. For this purpose, cross-efficiency Data Envelopment Analysis (DEA) is used, as it is able to summarize all the countrywide information for each maritime indicator into one single score using only data-driven, non-parametric, flexible weights. In this respect, DEA scores are calculated for each of the main maritime sectors both for each indicator (in terms of its different regional values) and, conversely, for each NUTS3 region (in terms of its corresponding indicators). Finally, variation among maritime DEA scores is discussed with the help of a combination of density and box-and-whisker plots.
The paper is organized as follows. Section 2 gives a statistical assessment of the degree of territorial coverage of maritime socioeconomic indicators. Section 3 first discusses the criteria followed to select the indicators and then explains the statistical method used for the computation of single scores based on the above information. The ensuing results obtained for the relative importance of the European Atlantic maritime economy are shown and interpreted in Section 4. Finally, Section 5 summarizes the main conclusions and implications of these results.

STATISTICAL COVERAGE OF MARITIME ACTIVITIES
The Marine Atlantic Regions Network (Marnet) is made up of institutions and regional authorities dedicated to marine/maritime socioeconomic research in the five countries of the European Atlantic area (France, Ireland, Portugal, Spain and United Kingdom). It started as a collaborative project funded by the European Regional Development Fund (ERDF) and the Interreg Atlantic Area Programme 2007-2013 with the main objective of designing a methodology to build a database of maritime socioeconomic data that were comparable between countries and replicable using available data sources (Foley et al., 2014).
The European Atlantic (EUA) maritime database is built taking into account four aspects: i.-indicators of socioeconomic interest (chiefly employment and business variables, such as value added, turnover, enterprises, exports, costs, energy production, etc., but also physical data such as vessels, landing tonnage and value, hotel overnights, sports facilities, etc.); ii.-maritime and marine-related activities from the European Statistical Classification of Economic Activities (NACE) up to four-digit level (Eurostat, 2008); iii.-territorial coverage from the European Nomenclature of Statistical Territorial Units (NUTS), and iv.-time period (2008-2012 annual data).
In addition to this, the Marnet project classified maritime activities into three different groups according to their relevance in the sector (Surís-Regueiro et al., 2013; Foley et al., 2014). Namely, Group 1 of fully maritime activities (e.g. marine fishing), Group 2 of mainly maritime activities (e.g. renting/leasing of water transport equipment) and Group 3 of partially maritime activities. The latter was later divided into two subgroups depending on the economic significance of the activity (e.g. hotels and similar accommodation vs. support activities for other mining and quarrying) (Fernández-Macho et al., 2015). As a summary of the statistical coverage of European Atlantic maritime economic sectors, Table 1a shows the initial distribution of maritime indicators: a total of 519 indicators of relative significance in economic terms, of which 202 are fully or mainly maritime.

Territorial Coverage
The Marnet project collected data at three territorial levels: i.-NUTS0: member state of the EU; ii.-NUTS2: basic regions for the application of EU regional policy, and iii.-NUTS3: small regions for specific diagnoses (e.g. 'départements' in France, provinces in Spain or, roughly, counties/councils in UK).
Appendix A shows the complete list of indicators classified by maritime sector, activity and group. The last five columns give the degree of territorial coverage of each indicator as measured by the percentage of EU Atlantic regions with data available at each NUTS level. Many indicators, usually related to maritime activities in the Living resources and Tourism sectors, have high territorial coverage even at the smallest NUTS3 regional level. However, when indicators are aggregated by activity we note that no maritime activity is even close to moderate levels of territorial coverage. For example, Table 2b shows the activities whose aggregated territorial coverage is greater than 20% at NUTS3 level and greater than 35% at NUTS2 level. We note that at NUTS3 level no activity reaches 35% aggregate coverage and even at NUTS2 level no activity surpasses 50% coverage.
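The coverage measure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the indicator names and data availability below are hypothetical, and aggregating by activity as a plain average of indicator coverages is an assumption, since the aggregation rule is not spelled out in the text.

```python
from statistics import mean

def coverage_percent(regions_with_data, all_regions):
    """Percentage of regions in all_regions for which an indicator has data."""
    covered = set(regions_with_data) & set(all_regions)
    return 100.0 * len(covered) / len(all_regions)

# Hypothetical subset of NUTS3 regions and hypothetical data availability.
nuts3_regions = ["ES111", "ES114", "PT171", "UKK30"]
availability = {
    "turnover": {"ES111", "ES114", "PT171"},
    "persons_employed": {"ES111"},
}

# Coverage of each indicator at NUTS3 level.
coverage = {name: coverage_percent(regions, nuts3_regions)
            for name, regions in availability.items()}

# Assumed aggregation rule: plain average over the indicators of one activity.
activity_coverage = mean(coverage.values())
```

The same function can be applied at NUTS2 or NUTS0 level by passing the corresponding list of areas.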

Failed Indicators
Appendix B shows the complete list of failed indicators classified by maritime sectors and activities with an indication of the EU Atlantic countries for which data are totally missing for that particular indicator. The last column gives the percentage of EU Atlantic regions with missing data in the rest of countries.
To summarize, Table 3 shows those maritime activities with indicators that have zero coverage at all NUTS levels. In particular, we note, for their significance, that fully/mainly maritime activities such as Sea/coastal passenger water transport, Extraction of crude petroleum and gas and Renting/leasing of water transport equipment have relevant indicators with zero coverage.
Tables 4a and 4b show the number of failed indicators by country. For each country in the European Atlantic area the figures correspond to the (nonexclusive/exclusive) number of maritime indicators with no data in that country (columns) distributed by percent failures in non-failed countries (rows). For example, out of a total of 519 indicators, Ireland fails to record any data for more than 60% of them (322). Of these, 227 (44% of all indicators) have no missing data in the countries that do report them, and the rest (27+68) also have regional gaps in those countries, classified by their degree of failure (either up to 50% or greater than 50%). More specifically, Ireland fails to record 7% (37) of indicators exclusively, i.e. indicators that no other country fails, while Portugal does not fail any indicator that is not also failed by some other country.
Table 4c gives similar information by number of failed countries. That is, the figures shown are the number of maritime indicators with no data in some countries distributed by number of failed countries (columns) and percent failures in the rest of countries (rows). For example, there are 61 indicators (12%) that are failed by all countries and 181 indicators (35%) that are failed by four countries, and 167 (32%) of the latter are completely present (not failed) in the remaining country. In fact, 76% of the failures correspond to indicators that are failed by a number of countries but completely present in the others. This means that the EUA maritime database can be completed in the future on an indicator-by-indicator basis, with countries that have been unable to record some data drawing on the experience of those that have been able to complete the corresponding indicators.
Nevertheless, we note that, in total, there are still 434 indicators in the database that are failed by at least one country. The remaining 85 are present in all countries and can then be used for statistical analysis and comparison purposes in what follows.
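The tabulations of failed indicators described above amount to simple counts over an indicator-by-country table. The following Python sketch shows one way to obtain them; the indicator names and failure patterns are hypothetical and are not taken from the Marnet database.

```python
from collections import Counter

COUNTRIES = ["FR", "IE", "PT", "ES", "UK"]

# Hypothetical example: for each indicator, the countries with no data at all.
failed_in = {
    "ind_a": set(),
    "ind_b": {"IE"},
    "ind_c": {"IE", "PT"},
    "ind_d": set(COUNTRIES),
    "ind_e": {"FR", "IE", "PT", "ES"},
}

# Non-exclusive count: indicators with no data in each country.
failures = Counter(c for failed in failed_in.values() for c in failed)

# Exclusive count: indicators failed by that country and by no other.
exclusive = Counter(next(iter(failed))
                    for failed in failed_in.values() if len(failed) == 1)

# Distribution of indicators by number of failing countries.
by_number_failed = Counter(len(failed) for failed in failed_in.values())

# Indicators present in every country, usable for cross-country comparison.
common = sorted(name for name, failed in failed_in.items() if not failed)
```

In this toy table, Ireland fails four indicators but only one of them exclusively, which mirrors the distinction between the nonexclusive and exclusive counts.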

Statistical Information Processing
To begin with, the actual values used for the indicators correspond to the latest year available in the database for each one of the p = 87 European Atlantic area NUTS3 coastal regions. Whenever no value is available at NUTS3 level within the past three years, the imputed value is that of the corresponding NUTS2 (or, alternatively, NUTS0) area. However, as discussed previously, not all maritime data are available for some countries, so that there are still indicators with missing data after the imputation. Table 4b shows the final distribution by class/group and sector of maritime indicators available for all regions after imputation and, therefore, actually used in the construction of the scores. In total, n = 85 indicators were finally available for all the European Atlantic area NUTS3 regions, of which 47 are fully maritime.
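The fallback rule described above can be sketched as follows. The sketch relies on the fact that NUTS codes are hierarchical (two characters for the country, four for NUTS2, five for NUTS3). The data values are hypothetical, and two points are assumptions of this sketch rather than statements from the text: the three-year window is applied at every level, and the value of the larger area is copied unchanged.

```python
def latest_value(series, years):
    """Most recent non-missing value of {year: value} within years, else None."""
    for year in sorted(years, reverse=True):
        value = series.get(year)
        if value is not None:
            return value
    return None

def regional_value(data, nuts3_code, years):
    """Value for a NUTS3 region, falling back to its NUTS2 area, then its country."""
    # NUTS codes are hierarchical: country = 2 characters, NUTS2 = 4, NUTS3 = 5.
    for code in (nuts3_code, nuts3_code[:4], nuts3_code[:2]):
        value = latest_value(data.get(code, {}), years)
        if value is not None:
            return value
    return None  # still missing after imputation

# Hypothetical values for one indicator, keyed by area code and year.
data = {
    "ES111": {2009: 10.0},             # too old for the window below
    "ES11": {2011: 40.0, 2012: 42.0},  # NUTS2 area containing ES111
    "PT": {2012: 7.0},                 # only country-level data
}
window = range(2010, 2013)  # assumed reading of "the past three years"

value_es111 = regional_value(data, "ES111", window)
value_pt171 = regional_value(data, "PT171", window)
value_ukk30 = regional_value(data, "UKK30", window)
```

An indicator for which some region still returns no value after this procedure is the kind that cannot enter the score computation.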

DEA Scores Computation
In short, the proposed statistical method aims, for each of the n maritime indicators (cases), to reduce the p values obtained from the different NUTS3 regions in the European Atlantic area to a single score. For this purpose an appropriate set of weights must be selected in order to calculate the intended score. Usually, a simple index uses a fixed set of weights chosen by the analyst for all the cases involved, e.g. a weighted average of the p regional values. However, it is then not clear how much of the scores obtained is due to the 'chosen' weights instead of the actual observations, or even whether the 'chosen' set may favor some cases against others. In contrast, Data Envelopment Analysis (DEA) is a linear programming technique that obtains flexible weights directly from the data (Charnes et al., 1978; Banker et al., 1984). DEA tries to find for each case a set of specific weights such that a weighted sum of values is maximized with the restriction that none of the cases receives a score greater than unity (for a recent review of DEA methods see Lovell and Pastor, 1999; Liu et al., 2011; Yang et al., 2014; and Cook and Seiford, 2009). More specifically, for each case or indicator k with values $z_{jk}$ (j=1,...,p), DEA maximizes over non-negative weights $w_j^{(k)}$
$$S_k^{(k)} = \sum_{j=1}^{p} w_j^{(k)} z_{jk} \quad \text{subject to} \quad \sum_{j=1}^{p} w_j^{(k)} z_{j\ell} \le 1, \quad \ell = 1,\dots,n.$$
Fig. 1 (see a larger format version in Supplemental Material) shows a typical example of assigning scores by DEA. As a rule, cases on the efficiency frontier are given a value of 1, while the scores assigned to inefficient cases correspond to their radial distance to the efficiency frontier. DEA scores can thus be thought of as the result of a self-evaluation relative to the efficiency frontier using flexible weights that are consistent with each case's own particular performance.
However, in order to obtain a more balanced view for comparison purposes, we may also want to incorporate peer evaluation into the value judgment. That is, individual cases may not only be assessed by their own weights but also by the weights chosen by any other case that represents a different feasible scenario in the system (Sexton et al., 1986; Doyle and Green, 1994). Let $\tilde{S}_k^{(k)}$ be the maximum self-evaluation score obtained by case k and let $\tilde{w}_j^{(k)}$, j=1,...,p, be the set of optimal weights for that case. With these weights the rest of the cases obtain the k-th peer evaluation scores
$$\tilde{S}_\ell^{(k)} = \sum_{j=1}^{p} \tilde{w}_j^{(k)} z_{j\ell}, \quad \ell \ne k,$$
and this is repeated for each k=1,...,n. That is, at the end of the process each case will receive a total of n values that can be written into the rows of an n×n cross-efficiency matrix $S = (\tilde{S}_\ell^{(k)})$ (Adler et al., 2002; Markovits-Somogyi, 2011).
Finally, the score for the ℓ-th case is obtained as the geometric mean of all the n self and peer evaluation scores, that is:
$$S(\ell) = \Bigl( \prod_{k=1}^{n} \tilde{S}_\ell^{(k)} \Bigr)^{1/n},$$
where it is clear that $0 \le S(\ell) \le 1$.
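As an illustration of the procedure, the following Python sketch computes self-evaluation weights, the cross-efficiency matrix and the geometric-mean scores for a toy data set. It is a sketch under stated assumptions and not the implementation used in the paper: the data are hypothetical, the weights are assumed to be non-negative, and the linear program is solved by brute-force enumeration of vertices, which is only practical for very small problems (a linear programming solver would be needed for n = 85 and p = 87). When several weight vectors are optimal, the sketch keeps the first one found; the text does not state how such ties are resolved.

```python
from itertools import combinations
from math import prod

def solve_square(a, b):
    """Solve a small square linear system by Gauss-Jordan elimination (None if singular)."""
    m = len(b)
    aug = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(m):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][m] / aug[i][i] for i in range(m)]

def self_evaluation_weights(z, k):
    """Non-negative weights maximizing case k's score with every case scoring <= 1."""
    n, p = len(z), len(z[0])
    # Constraint rows: n score caps (z_l . w <= 1), then p sign bounds (w_j >= 0).
    rows = [list(case) for case in z]
    rows += [[1.0 if i == j else 0.0 for i in range(p)] for j in range(p)]
    rhs = [1.0] * n + [0.0] * p
    best_w, best_score = None, -1.0
    for active in combinations(range(n + p), p):  # candidate vertices
        w = solve_square([rows[i] for i in active], [rhs[i] for i in active])
        if w is None or min(w) < -1e-9:
            continue
        if any(sum(wj * zj for wj, zj in zip(w, case)) > 1 + 1e-9 for case in z):
            continue
        score = sum(wj * zj for wj, zj in zip(w, z[k]))
        if score > best_score + 1e-12:
            best_w, best_score = w, score
    return best_w

def cross_efficiency(z):
    """Return the cross-efficiency matrix and the geometric-mean score of each case."""
    n = len(z)
    weights = [self_evaluation_weights(z, k) for k in range(n)]
    # cross[l][k]: score of case l under the weights chosen by case k.
    cross = [[sum(w * v for w, v in zip(weights[k], z[l])) for k in range(n)]
             for l in range(n)]
    return cross, [prod(row) ** (1.0 / n) for row in cross]

# Hypothetical toy data: 3 cases (rows) with p = 2 values each.
z = [[1.0, 0.2], [0.3, 1.0], [0.5, 0.5]]
cross, scores = cross_efficiency(z)
```

In this toy example the first two cases lie on the efficiency frontier, so their self-evaluation scores equal 1, while the third case obtains a self-evaluation score of about 0.80.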

MARITIME ECONOMY PERFORMANCE
The scores obtained for the indicators in the EUA maritime database can be interpreted as a measure of the relative importance of each maritime socioeconomic sector in the overall EU Atlantic maritime economy.

Sectoral DEA Scores Variation
Figure 2 (see a larger format version in Supplemental Material) shows some descriptive features of the distribution of sectoral DEA scores with the aid of so-called violin plots. A violin plot is a combination of rotated kernel density and box-and-whisker plots that helps to describe the most salient features of the distribution of a variable. The figure shows that sectors tend to have a large positive skew, which implies that most indicators are concentrated in a group with low values of relative importance, although there are a few outliers with higher performance. Namely, we have non-life insurance/reinsurance gross premiums written (65.12-65.20, Transportation) and turnover for processing/preserving of seafood (10.20, Living resources), repair/maintenance of ships/boats (33.15, Ship/boat building) and restaurants and food services (56.10, Tourism and recreation), with the last achieving the highest score of almost 100% cross-efficiency. The figure also shows that Tourism and recreation appears to be the most accomplished maritime sector of all, with three activities whose indicators obtain the highest scores. Namely, turnover and gross value added of 56.10: restaurants and food services, 56.30: beverage serving activities and 55.10: hotel accommodation. In comparison, all the other maritime activities achieve much lower scores, with Living resources in second place and Transportation in third. Some other relevant cases can be seen in the figure.
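The two features read off the violin plots, positive skew and a few high outliers, can also be checked numerically. The sketch below uses hypothetical scores; the 1.5 × IQR rule for flagging outliers is the common box-plot convention and is assumed here, as the text does not state which rule the plots use.

```python
from statistics import mean, pstdev, quantiles

def skewness(values):
    """Moment coefficient of skewness (population version)."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

def upper_outliers(values):
    """Values above the upper whisker, using the common 1.5 x IQR convention."""
    q1, _, q3 = quantiles(values, n=4)
    return [v for v in values if v > q3 + 1.5 * (q3 - q1)]

# Hypothetical percent DEA scores for the indicators of one sector.
scores = [5, 6, 7, 8, 9, 10, 12, 95]

sector_skew = skewness(scores)
sector_outliers = upper_outliers(scores)
```

A positive value of `sector_skew` together with a short list of upper outliers corresponds to the pattern described for most sectors.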

Regional DEA Scores Variation
The role of cases and values can also be reversed to obtain a geographical interpretation of the scores. That is, DEA scores are now calculated for each of the NUTS3 EU Atlantic regions (cases) from a common set of indicator values at different maritime activities. Table 5 shows country average percent DEA scores for each maritime sector. The table shows the relative importance that maritime activities have in the economy of EU Atlantic regions. For example, we note that Spanish and Portuguese regions score highest on average in most maritime sectors. On the other hand, France obtains the highest average score in Ship/boat building, Ireland and UK do the same in Transportation whilst UK scores high in Tourism and recreation also. Figure 3 (see a larger format version in Supplemental Material) shows violin plots of the regional variation of each maritime sector using the individual scores obtained by the different EU Atlantic NUTS3 regions. Except for Transportation and Tourism the sectors have a positive skew, which implies that most regions are concentrated in a group with low values of relative importance but there exist a few outliers with higher performance. Namely, this is the case of ES111 = A Coruña (Galicia) for Living resources, PT171 = Lisboa for Non-living resources and Education and R&D, UKK30 = Cornwall/Scilly for Ship/boat building, and PT112 = Cávado (Norte) for Construction. In the case of Transportation there is a clear bimodal distribution with a group of higher values made up of regions from Spain, Ireland and UK, and a lower group made up of regions from Portugal and France, whilst the Tourism and recreation sector is much more homogeneous than the other sectors with UKK30 = Cornwall/Scilly scoring highest. Some other relevant cases can be seen directly in the plot.
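Country averages such as those reported in Table 5 can be obtained by grouping regional scores by the country prefix of the NUTS3 code. The following sketch uses hypothetical scores and an unweighted mean, which is an assumption, as the text does not say whether the averages are weighted.

```python
from collections import defaultdict
from statistics import mean

def country_averages(region_scores):
    """Average score by country, using the first two characters of the NUTS3 code."""
    by_country = defaultdict(list)
    for code, score in region_scores.items():
        by_country[code[:2]].append(score)
    return {country: mean(values) for country, values in by_country.items()}

# Hypothetical percent DEA scores of four regions for one maritime sector.
region_scores = {"ES111": 80.0, "ES114": 60.0, "PT171": 50.0, "UKK30": 30.0}
averages = country_averages(region_scores)
```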

CONCLUSIONS
Seas and oceans are now considered drivers for economic development with great potential for innovation and growth. Hence, monitoring of maritime socioeconomic sectors has become a crucial aspect of the policy making process which needs empirical support to provide basic data.
In this sense, the Marine Atlantic Regions Network (Marnet) project was set up to develop a maritime socioeconomic database with a common methodology for the collection of comparable data on maritime activities in the European Atlantic area. However, there are still many gaps with respect to the spatial and sectoral coverage of the statistical information available. In order to help identify the activities where these data gaps most need to be filled, this paper has presented a statistical assessment of the data coverage offered for the different maritime economic sectors.
Regarding the degree of territorial coverage of the EUA maritime database, the paper presents for each indicator the percentage of EU Atlantic regions with data at each territorial NUTS level. In this respect, we can see that there are many indicators, usually related to maritime activities in the Living resources and Tourism sectors, with a high territorial coverage even at the smallest NUTS3 regional level. However, when territorial coverage is aggregated by activities we also note that no maritime activity is even close to moderate levels of coverage. For example, at the NUTS3 level none of the maritime activities reaches the 35% coverage and even at NUTS2 level no activity surpasses the 50% coverage.
As a consequence, a list of failed indicators classified by maritime sectors and activities was also prepared with an indication of the EU Atlantic countries for which data on that particular indicator are totally missing. In particular, we note that economically significant fully/mainly maritime activities such as Sea/coastal passenger water transport, Extraction of crude petroleum and gas and Renting/leasing of water transport equipment have relevant indicators with zero coverage.
A crude reading of the list would indicate that there are only 85 common indicators (17%) that are present in all countries and, consequently, that can be used for statistical analysis and comparison purposes. On the other hand, we also note that in fact there are just 61 indicators (12%) that are failed by all countries, which may indicate that they are the ones that are difficult to obtain. However, the vast majority of failures (76%) correspond to indicators that are failed by a number of countries but completely present in the others. This means that the EUA maritime database can be completed in the future on an indicator-by-indicator basis by focusing on those countries that have been able to complete the corresponding socioeconomic indicators.
In order to evaluate and compare the relative importance of each maritime sector, a DEA based statistical method is used to summarize all the countrywide information for each maritime indicator into one single score. In this manner the paper first evaluated and compared the relative performance of the maritime sectors. It appears that Tourism and recreation is the most accomplished maritime sector of all, with three activities (restaurants and food services, beverage serving activities and hotel accommodation) whose indicators (turnover and gross value added) obtain the highest scores. In comparison, all the other maritime activities achieve much lower scores, with Living resources in second place and Transportation in third.
When DEA scores are calculated for the NUTS3 EU Atlantic regions in terms of their indicator values, variation among EU Atlantic regions can also be evaluated in terms of the relative importance of their maritime activities. It turns out that Spanish and Portuguese regions score highest on average in most maritime sectors with France scoring high in Ship/boat building, Ireland in Transportation and UK in Transportation and Tourism and recreation.
Finally, the geographical distribution of each maritime sector using the individual regional scores shows that most regions are concentrated in groups with low values of relative importance with a few outliers of higher performance in all maritime sectors except Transportation and Tourism. In the case of Transportation there are two differentiated groups: Ireland, Spain and UK regions, for which this sector is relatively important, and Portugal and France, regions where the sector is of a lesser importance, whilst the Tourism and recreation sector shows the greatest homogeneity in terms of regional variation.
All these analyses and comparisons show the clear influence of the maritime activities on the EU Atlantic regions and may offer novel insights into their impact on the European Atlantic economy.