A Global Classification of Astronomical Tide Asymmetry and Periodicity Using Statistical and Cluster Analysis

Tidal asymmetry plays a key role in the behavior of estuaries. Positive and negative asymmetries are associated with flood- and ebb-dominated estuaries, respectively. Asymmetry arises from both the interaction among the main tidal constituents and the harmonics generated when the tide propagates in shallow waters. Most previous research focuses on the deformation of the tide within estuaries; however, the ocean tide may show asymmetry at the estuary entrance, which implies that the boundary condition is already deformed. This fact has important implications for tide propagation, estuarine transport processes, and flow exchanges between estuaries and open oceans. In this study, the global astronomical tide is classified according to its asymmetry and periodicity. The objective is to provide a guiding framework of representative astronomical tide types (ATtypes) on a worldwide scale to be used as a reference for further research on the transport of substances in estuaries. The applied methodology is based on the use of the TPXO9-atlas global barotropic tidal solution and detailed statistical analysis. Probability density functions of the tidal elevation time derivative and the tidal form factor were extracted from TPXO9-atlas with a spatial resolution ranging from 1/6° to 1/30°. The K-means algorithm was applied to these parameters, and 25 representative ATtypes were identified. The classification was validated with 757 worldwide tide gauge records. The results show that 11.3% of coastal areas show negative asymmetries, 11.3% positive asymmetries, while symmetric tides dominate 77.4% of coastal areas. In these areas, estuaries


Introduction
The asymmetry of the tide plays a key role in the behavior of estuaries (Aubrey & Speer, 1985; Brown & Davies, 2007; Dronkers, 1986; Friedrichs & Aubrey, 1988; Hoitink et al., 2003; Nidzieko, 2010; Zhou et al., 2018). Tidal asymmetry refers to the distortion of the tidal wave caused by the flood phase occurring faster than the ebb phase or vice versa. When the flood phase is shorter than the ebb phase, the flood currents are more intense than the ebb currents, and net transport into the estuary is induced; when the ebb phase is shorter than the flood phase, the estuary tends to export substances. This relationship has been traditionally used to explain sediment transport but can be extended to understand the transport of any other substance (e.g., marine litter and nutrients) or any other process dependent on estuarine hydrodynamics. Tidal asymmetry arises from both the interaction of the main tidal constituents in open oceans and the harmonics generated when the tide propagates in shallow-water areas (Hoitink et al., 2003; Nidzieko, 2010). Nidzieko (2010) highlighted the importance of the asymmetry imposed by the main tides in the mouth of an estuary since the asymmetry imposed in the mouth is overcome first, although the estuary morphology is significant in determining the dominance of the flood phase or ebb phase in these systems. Hence, the fact that the ocean tide shows asymmetry at the entrance of an estuary has significant implications for tide propagation and, consequently, for some of the most widely recognized estuarine processes. Some examples of estuarine issues that can be affected by the presence of an external asymmetry can be pointed out, namely, marine litter transport and distribution (Hinojosa & Thiel, 2009; Mazarrasa et al., 2019; Núñez et al., 2019), sediment transport and morphodynamics (Boon III & Byrne, 1981; Brown & Davies, 2007; Dronkers, 1986; Friedrichs & Aubrey, 1988, 1994; Gallo & Vinzon, 2005; Ranasinghe & Pattiaratchi, 2000), turbidity (Gallo & Vinzon, 2005), or flushing times (Monsen et al., 2002). Therefore, a detailed analysis of this asymmetry constitutes an important starting point to study several processes on an estuarine scale.
Asymmetry has been traditionally defined by two parameters: the ratio between the amplitudes of two tidal constituents (e.g., a M4 /a M2 ), employed to quantify the degree of distortion, and the difference between two phases (e.g., 2ϕ M2 − ϕ M4 ), used to identify the orientation (e.g., Aubrey & Friedrichs, 1988). This harmonic parametrization enables the asymmetry derived mainly from two constituents to be described. Other studies have used probability density functions (PDFs) of tidal elevation since PDFs retain more information and hence are more suitable for characterizing long-term tidal distributions (Castanedo et al., 2007; Woodworth et al., 2005). Following this research path, Nidzieko (2010) recommended the use of a parameter related to the PDF shape, the skewness coefficient (γ 1 ), and applied this coefficient to the tidal elevation time derivative, that is, a representative parameter of rising/falling tidal speed and, consequently, of flood/ebb currents in shore tidal estuaries. Nidzieko (2010) showed that for two tidal constituents, γ 1 kept the information of the traditional metric while maintaining the advantage of being applicable to any time series. Song et al. (2011) generalized this approach for any number of tidal constituents, and Nidzieko and Ralston (2012) used a variant of the skewness approach to consider asymmetry variations with changes over less than a fortnight. In summary, asymmetry can be quantified by using harmonic or statistical methods, both effective and with complementary advantages. Guo et al. (2019) found that harmonic methods are suitable for studying areas characterized by diurnal and semidiurnal regimes, while the application of statistical methods is relevant in mixed tidal regimes.
Most of the existing studies on asymmetry are based on analytical or empirical analyses of the deformation generated in a tide when it propagates through an estuary (Aubrey & Speer, 1985; Blanton et al., 2002; Brown & Davies, 2007; Byun & Cho, 2006; Dronkers, 1986; Friedrichs & Aubrey, 1988; Hoitink et al., 2003; Nidzieko, 2010; Nidzieko & Ralston, 2012; Ranasinghe & Pattiaratchi, 2000). However, the direct effect of asymmetry originating in the open ocean as a boundary condition for the transport of substances in estuaries has not yet been studied in depth. Hoitink et al. (2003) performed a theoretical study and provided an inventory of tidal constituents that contribute to asymmetry, including not only overtides and compound tides but also main tidal constituents. Song et al. (2011) developed a global approach for investigating asymmetry in the open ocean using the TPXO7-atlas solution with a spatial resolution of 0.25° and identified the constituents responsible for asymmetry worldwide based on the skewness parameter. They found a strong correlation between asymmetries due to different combinations of the main tidal constituents and tidal regimes (semidiurnal, mixed, and diurnal). However, this resolution cannot accurately simulate the tide in coastal areas. The TPXO9-atlas data set has an improved tidal accuracy compared to its previous versions (Jeon et al., 2019), and its spatial resolution (1/30°) allows the tide to be better characterized in coastal areas. Moreover, as a novelty, the astronomical tide has been classified worldwide according to the most complete information provided by PDFs (defined by both skewness and kurtosis coefficients) of the tidal elevation time derivative. As the only hydrodynamic driver analyzed in this study is the astronomical tide before its propagation through any estuary and without considering other phenomena, the tidal elevation time derivative variable, representative of rising/falling tidal speed, can be associated with tidal currents in estuarine environments and, consequently, with the transport of substances (Hoitink et al., 2003). Despite the availability of "barotropic tidal currents" from TPXO9-atlas, the "tidal elevation time derivative" was selected as the analysis variable because sea level data are available from tide gauges, whereas tidal current records are scarce. Therefore, hereinafter, references to flood/ebb tidal currents refer exclusively to currents due to temporal gradients in the tidal height in shore tidal estuaries. In addition, tidal periodicity was included in the analysis due to its influence on long-term transport in estuaries (Lesser, 2009). The K-means algorithm (Hastie et al., 2001) was applied to these tidal parameters to cluster astronomical tides and obtain representative types (hereinafter ATtypes) worldwide. The results were validated with data from 757 worldwide tide gauges selected from the GESLA-2 data set (Woodworth et al., 2017) to cover the world's coastal areas. Furthermore, the main tidal constituents that influence the asymmetric character of each ATtype have been identified based on the methodology proposed by Song et al. (2011). This aspect allows tidal studies to be addressed not only from a statistical approach but also using harmonic methods, which provides additional value to the graphic guide generated herein.
The remainder of this paper is structured as follows: Section 2 describes the data sets, tools, and methods used to analyze global tide asymmetry, section 3 describes the validation process, section 4 discusses the results, and section 5 draws the main conclusions of the study.

Data for Classification and Validation
Hourly time series of astronomical tide elevations were computed from harmonic constituents provided by the TPXO global and local tidal models developed by Oregon State University (OSU) (Egbert & Erofeeva, 2002). These models represent an optimal least squares fit of the Laplace tidal equation to satellite altimetry data. In this study, the TPXO9-atlas solution was used; this data set combines the TPXO9.v1 global solution (dx = 1/6°) with local solutions in coastal areas, including the Arctic and Antarctic. TPXO9-atlas provides the M2, S2, N2, K2, K1, O1, P1, Q1, M4, MS4, MN4, and 2N2 harmonic constituents with a resolution of 1/30° and the Mm, Mf, and S1 constituents with a resolution of 1/6° (for more details, see https://www.tpxo.net/global/tpxo9-atlas).
A selected set of tide gauge records from the GESLA-2 (Global Extreme Sea Level Analysis Version 2) data set (www.gesla.org) was used to validate the clustering results. GESLA-2 is a quasi-global data set comprising high-frequency (hourly or more frequent) sea-level records from tide gauges distributed worldwide. Contributions to GESLA-2 come from 30 sources, 27 of which are considered public, and three are private. The main contribution comes from the University of Hawaii Sea Level Center, which provides more than one-quarter of the total station years. A total of 39,151 station years are available from 1,355 records; that is, an average of 29 years is available per record. However, the number of years varies between 1 year for many of the records and over 160 years in places such as Brest (France). The quality of this data set has been controlled by each provider and has been demonstrated in its applications in previous studies (e.g., Hunter et al., 2017; Menéndez & Woodworth, 2010; Wahl et al., 2017). For a detailed description of this data set, refer to Woodworth et al. (2017).

Tidal Statistics
Probability density functions (PDFs) of the tidal elevation time derivative (dη/dt) were generated from tidal elevation time series, obtained from TPXO9-atlas, to characterize the astronomical tide worldwide (Figure 1a). A 19-year period was selected to build these dη/dt time series and investigate the modulation of a nodal cycle. The highest spatial resolution of 1/30° offered by TPXO9-atlas models was used to represent the high variability induced by coastlines. However, a resolution of 1/6° was considered suitable for describing the tide in deep oceans, where there is no difference between the 1/6° resolution TPXO9 base global solution and the 1/30° TPXO9-atlas. Twenty bins were found to be adequate for representing the PDFs of astronomical tides worldwide.
PDFs were standardized by the maximum magnitude of the tidal elevation time derivative (|dη/dt| max ) to find globally representative ATtypes. The maximum tidal elevation and therefore the rising/falling tidal speeds show large variations in different areas of the world. Figure 1b shows the worldwide distribution of |dη/dt| max and identifies some of the areas with the highest rising/falling speeds. In general, tides in the open ocean are microtidal with small rising/falling speeds, while both the tidal range and the tidal speed can be significantly higher in coastal areas. Tidal effects can be regionally amplified, and the tidal range exceeds 8 m in specific areas of the world (Chan & Archer, 2003); the five regions with maximum tidal speed framed in Figure 1b coincide with some of these areas. The maximum tidal speed reaches 5 m/s in the English Channel (Dauvin, 2012), 3.5 m/s in the Hudson Strait (NGA, 2017), 3 m/s in the Bay of Fundy (Karsten et al., 2008), 2.4 m/s in Isla Grande de Tierra del Fuego (Bujalesky, 1997), and 1.5 m/s along the NW coast of Australia (Porter-Smith et al., 2004).
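The construction of a standardized PDF of dη/dt described above can be sketched as follows. This is a minimal illustration, not the authors' processing chain: the function names, the use of a centered finite difference for the derivative, and the two-constituent example tide (amplitudes and phases) are assumptions made for the example. Only the 20 bins, the hourly sampling, and the standardization by |dη/dt| max come from the text.

```python
import numpy as np

def tidal_elevation(t_hours, amps_m, speeds_deg_per_hour, phases_deg):
    """Astronomical tide as a sum of cosines: eta(t) = sum a_i cos(w_i t - phi_i)."""
    a = np.asarray(amps_m, dtype=float)[:, None]
    w = np.deg2rad(np.asarray(speeds_deg_per_hour, dtype=float))[:, None]
    phi = np.deg2rad(np.asarray(phases_deg, dtype=float))[:, None]
    return np.sum(a * np.cos(w * t_hours[None, :] - phi), axis=0)

def standardized_rate_pdf(eta, dt_hours=1.0, n_bins=20):
    """PDF of d(eta)/dt after dividing by its maximum magnitude."""
    rate = np.gradient(eta, dt_hours)           # finite-difference d(eta)/dt
    rate_std = rate / np.max(np.abs(rate))      # standardized to [-1, 1]
    density, edges = np.histogram(
        rate_std, bins=n_bins, range=(-1.0, 1.0), density=True
    )
    return density, edges

# Hypothetical example: one year of hourly data, M2 plus a small M4 overtide.
t = np.arange(0.0, 24.0 * 365)
eta = tidal_elevation(t, [1.0, 0.2], [28.9841042, 57.9682084], [0.0, 90.0])
density, edges = standardized_rate_pdf(eta)
```

With `density=True` the histogram integrates to one over the standardized range, so PDFs from sites with very different tidal ranges can be compared on the same axis.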
To avoid potential errors derived from the discretization in bins, the PDFs were not directly clustered; instead, two statistical parameters defining the PDF shape were used, namely, the skewness (γ 1 ) and kurtosis (g 2 ). γ 1 , calculated as the normalized sample skewness of the tidal elevation time derivative (dη/dt = η′) (Equation 1), describes how one tail of a PDF is distributed over the other; that is, it is a measure of the distribution asymmetry:

$$\gamma_1 = \frac{\mu_3}{\sigma^3} = \frac{\frac{1}{\tau-1}\sum_{t=1}^{\tau}\left(\eta'_t-\overline{\eta'}\right)^3}{\left[\frac{1}{\tau-1}\sum_{t=1}^{\tau}\left(\eta'_t-\overline{\eta'}\right)^2\right]^{3/2}} \qquad (1)$$

where μ 3 is the third sample moment about the mean, σ is the standard deviation (the square root of the second sample moment about the mean), and τ is the number of data in the time series (hours in 19 years). This parameter was proposed by Nidzieko (2010) and was then used by Song et al. (2011) and adapted by Nidzieko and Ralston (2012) to characterize tides. If γ 1 > 0, the distribution is positively asymmetric, which implies that tidal falling speeds are more frequent than rising speeds; consequently, in shore tidal estuaries, the flood phases are shorter than the ebb phases, and flood currents are more intense (Figure 2a). Estuaries with this type of overall tidal asymmetry imposed on their mouths, where the tide is the key factor, can generally be associated with an accretive trend in such areas. It should be noted that sediment import or export depends not only on the tide but also on many other factors related to hydrodynamics (e.g., presence and type of river discharge, temperature and salinity conditions, or currents induced by waves) and to sedimentary features (Ridderinkhof et al., 2000). If γ 1 < 0, the distribution is negatively asymmetric and is characterized by more intense speeds during the tidal fall than during the rise (Figure 2c). Tidal estuaries subject to these asymmetries commonly show a regressive trend in their mouths. It is worth noting that γ 1 can vary (by increasing, decreasing, or even changing its sign) when the tide propagates in an estuary depending on the estuarine morphology and other processes, such as river contributions. Consequently, an estuary can present, for example, ebb dominance in its mouth and flood dominance inside (Nidzieko, 2010) or vice versa. However, γ 1 does not provide all the information contained in a PDF. Three PDFs with positive asymmetry (γ 1 = 0.8), three symmetric PDFs (γ 1 = 0), and three PDFs with negative asymmetry (γ 1 = −0.8) are shown in Figure 2. This figure illustrates how PDFs can take different shapes yet have equal skewness. The missing information is provided by the kurtosis coefficient. The convention adopted in this study is known as excess kurtosis (g 2 ), corresponding to Equation 2:

$$g_2 = \frac{\mu_4}{\sigma^4} - 3 \qquad (2)$$

where μ 4 is the fourth sample moment about the mean. Traditionally, there has been controversy in the interpretation of g 2 as a measure of peakedness. However, according to Balanda and MacGillivray (1988) and Westfall (2014), g 2 measures the concentration of data outside one standard deviation of the mean in either tail of the distribution; that is, this parameter describes how the data are distributed in the areas between the peak and tails (in the areas known as shoulders) without saying anything about the peak.
Figure 2 shows how, regardless of the γ 1 value, the concentration of data in the shoulders moves towards the tails and increases its frequency with a decrease in g 2 . Therefore, this parameter allows distributions with equal γ 1 to be compared and refers to the frequencies of the higher rising/falling speeds of distributions with equal γ 1 ; that is, g 2 allows tides with analogous asymmetries and similar maximum tidal currents to be associated with different transport capacities in estuarine environments.
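The two shape parameters can be computed directly from a dη/dt series, as in the sketch below. It assumes simple 1/τ sample moments; for a 19-year hourly series the difference from any other moment normalization is negligible. The function name and the test signals are illustrative only. A pure sinusoid gives γ 1 = 0 and g 2 = −1.5, which is consistent with the lower limit of g 2 reported for symmetric tides, and adding a phase-locked overtide produces a nonzero γ 1 .

```python
import numpy as np

def skewness_and_excess_kurtosis(x):
    """Return (gamma_1, g_2) of a series using moments about the mean."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    m2 = np.mean(d ** 2)            # second moment (variance)
    m3 = np.mean(d ** 3)            # third moment
    m4 = np.mean(d ** 4)            # fourth moment
    gamma_1 = m3 / m2 ** 1.5        # normalized skewness
    g_2 = m4 / m2 ** 2 - 3.0        # excess kurtosis
    return gamma_1, g_2

# One full cycle sampled uniformly (endpoint excluded so averages are exact).
theta = np.linspace(0.0, 2.0 * np.pi, 1000, endpoint=False)

# Rate of change of a single-constituent tide: symmetric.
rate_single = -np.sin(theta)
g1_single, g2_single = skewness_and_excess_kurtosis(rate_single)

# Rate of change with an illustrative overtide at twice the frequency,
# phased so that the rise is faster than the fall (positive skewness).
rate_distorted = -np.sin(theta) - 0.4 * np.cos(2.0 * theta)
g1_distorted, g2_distorted = skewness_and_excess_kurtosis(rate_distorted)
```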
Figures 3a and 3b show the global distributions of γ 1 and g 2 , respectively, where different tidal patterns can be identified according to the PDF shape. The largest γ 1 values, both positive and negative, are in very specific areas of the world that are relatively adjacent (warm-colored areas for γ 1 > 0 and cold-colored areas for γ 1 < 0 in Figure 3a). It is worth noting that the most asymmetric tides are concentrated in the northern and southern areas of the Pacific Ocean, the area between the Gulf of Mexico and the Caribbean Sea, the coastal areas of the United Kingdom, the North Sea, the Mediterranean Sea, the Arabian Sea, the SW coastline of Australia, the China Seas, and the Japan Sea. As shown in Figure 3b, g 2 adopts average values of approximately −0.5 in these areas. In contrast, γ 1 values close to zero dominate the central area of the Pacific Ocean, the Atlantic Ocean, and the central area of the Indian Ocean, which coincide with the areas where g 2 adopts relatively low values (g 2 < −0.8). The highest values of g 2 are around 0, reaching 0.8 in some specific areas of the world, such as the Patagonian Shelf. Such g 2 values are located in the proximity of the tidal amphidromic points, where tidal amplitudes and, therefore, tidal speeds are null. Figure 3c shows a scatter plot of γ 1 -g 2 and the frequencies of different types of PDFs around the world according to these parameters. Some considerations can be drawn from this figure. On the one hand, both parameters show well-defined limits, namely, −0.9 and 0.9 for γ 1 and −1.5 and 0.8 for g 2 . On the other hand, symmetric distributions (|γ 1 | < 0.1) can show kurtosis within the entire available range [−1.5, 0.8], indicating wide variations in the frequencies of dη/dt, that is, from high frequencies for g 2 of approximately −1.5 to low frequencies for g 2 near 0.8. However, as the asymmetry of a distribution increases (|γ 1 | > 0.1), the range of g 2 narrows around higher values [−0.2, 0.8]. Moreover, the most frequent tidal distributions in the world are symmetric (|γ 1 | < 0.1) and have small kurtosis (g 2 < −0.8), whereas this frequency decreases with increasing |γ 1 | and g 2 .
To achieve a more complete description of tides worldwide, the tidal regime (TR), that is, its periodicity, was additionally included in this study. TR is calculated through the tidal form factor (F), defined as the ratio of the amplitudes of the O1 and K1 constituents to those of the M2 and S2 constituents (Defant, 1961) (Equation 3):

$$F = \frac{a_{O1} + a_{K1}}{a_{M2} + a_{S2}} \qquad (3)$$

F lower than 0.25 defines semidiurnal tidal regimes (S); F between 0.25 and 1.5 characterizes mixed semidiurnal tides (MS); F between 1.5 and 3 is representative of mixed diurnal tides (MD); and F higher than 3 reflects diurnal tidal characters (D). Figure 4 shows the worldwide distributions of the tidal form factor (F) and their associated tidal regimes (TRs). From a comparison between Figures 3 and 4, the existence of a relationship between F and γ 1 -g 2 can be detected. Previous studies, such as those conducted by Blanton et al. (2002), Byun and Cho (2006), and Hoitink et al. (2003), revealed asymmetries arising from some particular combinations of tidal constituents with specific tidal regimes, and Song et al. (2011) found a relationship between the tidal asymmetry (γ 1 ) generated in the open ocean and F.
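The regime thresholds above translate into a short lookup, sketched here. The form factor is taken as the sum of the O1 and K1 amplitudes over the sum of the M2 and S2 amplitudes, following the definition in the text. The function names are illustrative, and the assignment of values lying exactly on a threshold (0.25, 1.5, or 3) is an assumption, since the text does not specify it.

```python
def tidal_form_factor(a_o1, a_k1, a_m2, a_s2):
    """Form factor F = (a_O1 + a_K1) / (a_M2 + a_S2); amplitudes in the same unit."""
    return (a_o1 + a_k1) / (a_m2 + a_s2)

def tidal_regime(f):
    """Map a form factor to S, MS, MD, or D using the thresholds 0.25, 1.5, and 3."""
    if f < 0.25:
        return "S"    # semidiurnal
    if f < 1.5:
        return "MS"   # mixed, mainly semidiurnal
    if f <= 3.0:
        return "MD"   # mixed, mainly diurnal
    return "D"        # diurnal

# Hypothetical amplitudes in meters, for illustration only.
f_example = tidal_form_factor(a_o1=0.3, a_k1=0.5, a_m2=1.0, a_s2=0.6)
regime_example = tidal_regime(f_example)
```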

Clustering
The K-means algorithm was applied to γ 1 , g 2 , and TR to classify the tides. Due to the relationship among these three variables, joint clustering was performed. The K-means algorithm divides the starting data set into a given number of subsets. Each subset is represented by a centroid or prototype and consists of the data that are best represented by this prototype (Hastie et al., 2001). In this study, K-means was initialized with the maximum dissimilarity algorithm (Camus et al., 2011), and the parameters to be clustered were weighted according to the area that each triplet represents, (1/30°)² or (1/6°)². Once the centroids were found, the closest γ 1 , g 2 , and TR values of the original data set and their associated standardized PDF were selected. To find the optimum number of clusters (N) that best describes the ATtypes worldwide, K-means was preliminarily tested for different N values ranging between 2 and 98. A comparison between the original tide series and the synthetic time series reconstructed with the clusters derived from these classifications was performed using the efficiency coefficient (CE). CE was selected because it is the best objective function for reflecting the overall adjustment of a clustering output (Bárcena et al., 2015). CE is a normalized statistic developed by Nash and Sutcliffe (1970) that determines the relative magnitude of the residual variance (noise) and therefore reflects the accuracy with which the centroids estimate the original data (Equation 4):

$$CE = 1 - \frac{\sum_{i=1}^{n}\left(O_i - C_i\right)^2}{\sum_{i=1}^{n}\left(O_i - \overline{O}\right)^2} \qquad (4)$$

where O i is the i-th datum (γ 1 , g 2 , or TR) of the original series, C i is the i-th datum (γ 1 , g 2 , or TR) of the representative centroid, \overline{O} is the mean of the original series, and n is the number of nodes where the tide is evaluated. CE ranges between −∞ and 1.0. The level of agreement between two series is valued as excellent if CE is higher than 0.8, good if CE ranges between 0.6 and 0.8, poor if CE is lower than 0.5, and unacceptable if CE is lower than 0.
Figure 5a shows the scatter plots of γ 1 -g 2 associated with each TR from TPXO9-atlas (color dots) and from the clustering (dots inside the black circles) for N equal to 9, 16, 25, and 36. The representativeness of the original data increases with the number of clusters. However, in the case of 36 clusters, centroids begin to be similar for the most frequent distributions (symmetric with small kurtosis), reducing the improvement in representativeness. N equal to 25 was determined to be adequate because CE exceeds 0.8 for the three analyzed parameters (γ 1 , g 2 , TR) and because the improvements are no longer significant for higher N values (Figure 5b).
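The clustering and its evaluation can be sketched with NumPy alone. The code below is an illustration under stated assumptions rather than the procedure used in the study: the seeding is a simplified greedy variant of maximum-dissimilarity selection (it starts from the point farthest from the data mean), the weights enter only the centroid update, the data are synthetic two-column points, and all function names are hypothetical. The efficiency coefficient is the Nash and Sutcliffe statistic, one minus the ratio of the residual sum of squares to the sum of squares about the mean of the original data.

```python
import numpy as np

def maxdiss_seeds(x, k):
    """Greedy max-dissimilarity seeding: each new seed is the point farthest
    from its nearest already-chosen seed."""
    first = int(np.argmax(np.linalg.norm(x - x.mean(axis=0), axis=1)))
    chosen = [first]
    dmin = np.linalg.norm(x - x[first], axis=1)
    while len(chosen) < k:
        nxt = int(np.argmax(dmin))
        chosen.append(nxt)
        dmin = np.minimum(dmin, np.linalg.norm(x - x[nxt], axis=1))
    return x[chosen].copy()

def assign(x, centroids):
    """Index of the nearest centroid for every row of x."""
    d = np.linalg.norm(x[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1)

def weighted_kmeans(x, weights, k, n_iter=100):
    """K-means in which each point contributes to its centroid by its weight
    (for example, the area represented by a grid node)."""
    centroids = maxdiss_seeds(x, k)
    for _ in range(n_iter):
        labels = assign(x, centroids)
        updated = centroids.copy()
        for j in range(k):
            members = labels == j
            if members.any():
                updated[j] = np.average(x[members], axis=0, weights=weights[members])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return centroids, assign(x, centroids)

def efficiency_coefficient(original, estimate):
    """Nash-Sutcliffe efficiency between original data and their estimates."""
    original = np.asarray(original, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    residual = np.sum((original - estimate) ** 2)
    spread = np.sum((original - original.mean()) ** 2)
    return 1.0 - residual / spread

# Synthetic (skewness, kurtosis)-like points around three made-up prototypes.
rng = np.random.default_rng(0)
blobs = np.array([[-0.5, -0.5], [0.0, -1.2], [0.5, -0.5]])
x = np.vstack([b + 0.03 * rng.standard_normal((200, 2)) for b in blobs])
w = np.ones(len(x))                      # equal-area nodes in this toy case
centroids, labels = weighted_kmeans(x, w, k=3)
ce = [efficiency_coefficient(x[:, c], centroids[labels, c]) for c in range(x.shape[1])]
```

Repeating the last two lines for a range of cluster counts and keeping the smallest count for which every column exceeds 0.8 mirrors the selection criterion described above.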

ATtypes and Contribution of Combinations of Tidal Constituents
Figure 6a shows the centroids (C i ) from clustering sorted from the highest to the lowest value of γ 1 . Each C i is defined in terms of the shape of its PDF of dη/dt (i.e., its γ 1 and g 2 coefficients) and its tidal regime (S, MS, MD, and D) and represents an astronomical tide type (ATtype). Figure 6b shows the geographical distribution of the 25 identified ATtypes.
The  6b). The estuaries located in these coastal areas display flood dominance in their mouths, but this dominance can be affected or modified as a result of the estuarine morphology as the tide propagates into them. On the other hand, negative asymmetries (from C20 to C25) cover 8.2% of the open ocean and 11.3% of all coastal areas. The southern Caribbean Sea, the north coast of Greenland, the South China Sea, and the west coast of Australia are clear examples of areas with negative tidal asymmetries, and the estuaries hosted within these areas exhibit ebb dominance in their outermost zone (cold-colored areas in Figure 6b).
Moreover, the main tidal constituents that contribute to the total asymmetry in each ATtype were identified. To this end, the method proposed by Song et al. (2011) was applied to the 25 clusters derived from TPXO9-atlas. For each of these clusters, the contributions (β i ) to γ 1 of the different combinations (pairs or triplets) of tidal constituents that can influence the asymmetry were calculated, that is, the constituents whose frequencies satisfy 2ω 1 = ω 2 or ω 1 + ω 2 = ω 3 . Finally, the subset of tidal constituents that best explains the γ 1 -g 2 pair was selected from those constituents.
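A sketch of how such contributions can be evaluated from harmonic constants is given below. It assumes the tide is written as a sum of cosines a i cos(ω i t − ϕ i ); for that form, the skewness of dη/dt splits into a term (3/4) a 1 ² ω 1 ² a 2 ω 2 sin(2ϕ 1 − ϕ 2 ) for each pair with 2ω 1 = ω 2 and a term (3/2) a 1 ω 1 a 2 ω 2 a 3 ω 3 sin(ϕ 1 + ϕ 2 − ϕ 3 ) for each triplet with ω 1 + ω 2 = ω 3 , each divided by [(1/2) Σ a i ² ω i ²]^(3/2). Whether this matches the exact implementation used in the study is an assumption. The function name, the tolerance, and the amplitudes and phases in the example are hypothetical; the angular speeds are the standard values for O1, K1, M2, and M4 in degrees per hour.

```python
import itertools
import numpy as np

def skewness_contributions(names, amps, speeds, phases_deg, tol=1e-6):
    """Contribution of each constituent pair (2*w1 = w2) and triplet
    (w1 + w2 = w3) to the skewness of d(eta)/dt, keyed by constituent names."""
    b = np.asarray(amps, dtype=float) * np.asarray(speeds, dtype=float)  # a_i * w_i
    phi = np.deg2rad(np.asarray(phases_deg, dtype=float))
    denom = (0.5 * np.sum(b ** 2)) ** 1.5   # variance of d(eta)/dt to the 3/2
    n = len(names)
    beta = {}
    for i, j in itertools.permutations(range(n), 2):
        if abs(2.0 * speeds[i] - speeds[j]) < tol:
            beta[(names[i], names[j])] = (
                0.75 * b[i] ** 2 * b[j] * np.sin(2.0 * phi[i] - phi[j]) / denom
            )
    for i, j in itertools.combinations(range(n), 2):
        for k in range(n):
            if k not in (i, j) and abs(speeds[i] + speeds[j] - speeds[k]) < tol:
                beta[(names[i], names[j], names[k])] = (
                    1.5 * b[i] * b[j] * b[k] * np.sin(phi[i] + phi[j] - phi[k]) / denom
                )
    return beta

# Hypothetical harmonic constants (amplitudes in m, phases in degrees).
names = ["O1", "K1", "M2", "M4"]
speeds = [13.9430356, 15.0410686, 28.9841042, 57.9682084]   # degrees per hour
beta = skewness_contributions(names, [0.25, 0.35, 0.60, 0.05], speeds,
                              [30.0, 80.0, 200.0, 10.0])
gamma_1 = sum(beta.values())
share = {combo: 100.0 * value / gamma_1 for combo, value in beta.items()}
```

Because every term is a ratio of cubic quantities in a i ω i , the angular speeds can be given in any consistent unit. A negative entry in `share` marks a combination that counteracts the total skewness.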
Figure 7a shows the pairs and triplets of tidal constituents that contribute to tidal asymmetry as well as the value of each contribution (β i ), expressed as a percentage, to each ATtype (C i ). Each C i on the y-axis is represented by a color associated with a tidal regime, namely, blue for S, green for MS, yellow for MD, and red for D. It is worth noting that there are combinations that contribute with the same orientation as the characteristic skewness (γ 1 ) of each C i (warm-colored cells in Figure 7a), while others counteract γ 1 (cold-colored cells in Figure 7a). The sum of all contributions results in the representative γ 1 . As the figure shows, in symmetric tides (|γ 1 | < 0.1) a greater number of combinations contribute similar percentages, with different orientations, to the resulting skewness, so that the contributions balance one another and γ 1 cancels out. Conversely, tides with a significant asymmetrical component (|γ 1 | > 0.1) show a clear and large contribution. In general, the astronomical triplet O1/K1/M2 dominates the D, MD, and MS regimes, while the P1/K1/S2 and M2/M4 combinations play a secondary or tertiary role in the asymmetry. In the S regime, the main contribution comes from M2/M4. Among the set of analyzed combinations, the subset of constituents that best explains the γ 1 -g 2 pair was selected (framed in gray). This analysis indicates that a relatively small number of tidal constituents plays a role in the definition of γ 1 ; however, a greater number is necessary to characterize g 2 .
Figure 7b shows the good agreement between the total skewness and excess kurtosis resulting from the classification (γ 1 -g 2 ) and the same statistics obtained from the main tidal constituents (γ 1 -g 2 ), with R 2 exceeding 0.9 in both cases.
Figure 8 shows the global distributions of the contributions (β i ) of the main tidal constituents responsible for the asymmetry of each of the identified ATtypes.

Validation
The classification results were validated with 757 selected tide gauge records from GESLA-2. To ensure the representativeness of the external tide, one record was selected per tide gauge located at a distance of less than 10 km from the nodes where the clustering outputs are available. On the one hand, the PDFs of dη/dt produced by the clustering were compared with those obtained from the tide gauge records by applying the CE statistic (Equation 4). Since PDFs depend on the analyzed time window and tide gauge records cover different time periods, a harmonic analysis of each record was performed to then reconstruct the dη/dt series over the same 19-year period. On the other hand, the degree of matching between the tidal regimes from both sources was evaluated. Regarding the comparison among the PDFs, Figure 9 shows the degree to which the outputs from clustering represent the tidal conditions recorded at the tide gauges. A total of 684 tide gauges are represented in blue, which is associated with an excellent characterization (CE > 0.8); 54 tide gauges are shown in green, indicative of good agreement (0.6 < CE ≤ 0.8); seven tide gauges are shown in yellow, which reflects poor representativeness (0.5 < CE ≤ 0.6); 14 tide gauges show acceptable but worse than poor agreement (0 < CE ≤ 0.5) and are represented in orange; and finally, one tide gauge is represented in red, indicating unacceptable representativeness (CE ≤ 0). Hence, 97% of the clustered PDFs characterize the PDFs of the world's coastal areas well or very well, and only 3% exhibit poor or unacceptable representativeness. As an example of the ability of the classification to characterize tides even in complex tidal areas, Figure 9b shows the Adriatic Sea. In this region, where up to 10 different ATtypes with γ 1 ranging from −0.36 to 0.32 are observed, the validation verifies the classification results with an excellent degree of agreement.
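The color classes used in this comparison amount to a threshold lookup on CE, sketched below together with the statistic itself. The function names and the label strings are illustrative. The thresholds (0, 0.5, 0.6, and 0.8) are those of the validation classes above, and the sample CE values are the ones quoted for individual gauges in this section.

```python
import numpy as np

def efficiency_coefficient(original, estimate):
    """Nash-Sutcliffe efficiency, e.g. between a gauge PDF and a clustered PDF."""
    original = np.asarray(original, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    residual = np.sum((original - estimate) ** 2)
    spread = np.sum((original - original.mean()) ** 2)
    return 1.0 - residual / spread

def ce_rating(ce):
    """Agreement class for a CE value, following the validation color scale."""
    if ce > 0.8:
        return "excellent"        # blue
    if ce > 0.6:
        return "good"             # green
    if ce > 0.5:
        return "poor"             # yellow
    if ce > 0.0:
        return "worse than poor"  # orange
    return "unacceptable"         # red

# CE values quoted in the text for individual tide gauges.
quoted = {"c3": 0.93, "d4": 0.87, "d2": 0.50, "d1": 0.39, "e1": 0.23, "c1": 0.0}
ratings = {gauge: ce_rating(value) for gauge, value in quoted.items()}
```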
The main reason for the disagreement between the clustered PDFs and those obtained from the tide gauge records is not related to the type of PDF but rather to the geometric complexity of the area surrounding the represented node. Figure 9 shows that the classification represents different types of distributions with a high level of agreement, such as symmetric bimodal distributions (e.g., c3 CE = 0.93 and e2 CE = 0.99), asymmetric bimodal distributions (e.g., d4 CE = 0.87), and asymmetric unimodal distributions (e.g., f2 CE = 0.98). Concerning this issue, areas c, d, e, and f in Figure 9 represent the main problem areas with regard to representativeness. Area c corresponds to the East Coast of the United States ranging from Chesapeake Bay to the vicinity of Portland (Maine, USA). This area contains the only tide gauge that the clustering classifies with an unacceptable level (c1 CE = 0). The reason for this result is the location of the tide gauge in a shallow and narrow area on the NW coast of Nantucket Island (USA), where different types of tides converge (C5, C13, and C23 centroids with γ 1 values equal to 0.32, 0, and −0.36, respectively). Area d corresponds to the coastlines of Ireland and the United Kingdom, where tide gauges with low levels of concordance are observed (e.g., d1 CE = 0.39 and d2 CE = 0.50). Due to the complex geometry of the Northern Europe coastlines (Irish Sea, the English Channel, etc.) and the effect of this geometry on the tides, almost half (45.5%) of the tide gauges located in these areas show CE values lower than 0.6. In area e, located along the northern Brazilian coast, two tide gauges are inadequately characterized due to their locations in very shallow-water areas that local TPXO models fail to describe (e.g., e1 CE = 0.23). In area f (Japan), there are a few points with a low classification representativeness since they are embedded in narrow areas (e.g., f1 CE = 0.31). Despite all of the abovementioned factors, it can be seen in Figure 9 that even in these problematic areas, the number of nodes with a good or excellent level of agreement predominates over the number of nodes with poor or worse concordance.
Figure 10 shows the tidal regimes (TRs: S in blue, MS in green, MD in yellow, and D in red) obtained from the selected tide gauges (outer ring) and those according to the performed classification (inner circle). In addition, the mismatched points are highlighted with a violet ring. The TRs coincide in 87% of the locations compared, while there is one regime difference in the remaining 13%. The main areas in which these mismatches are concentrated are the Gulf of Mexico and the East Coast of the United States (area a1 in Figure 10) and the west coast of Japan (area a2). Figure 10b shows a scatter plot between the tidal form factors (F) obtained from the tide gauge records and those obtained from TPXO9-atlas, where the mismatched points are highlighted with a red circle. This figure verifies that most of the points where the clustered TRs do not coincide with the recorded TRs are in the transition areas between two regimes.
As mentioned in section 1 (Introduction), the dη/dt variable was chosen for this study instead of the tidal current (u) because of its validation possibilities and since long tidal current records for the entire globe are scarce. To check the similarity between these two variables in coastal areas, the γ 1 and g 2 values of the 19-year time series of both dη/dt and u in the average direction of tidal propagation (both variables reconstructed from TPXO9-atlas) were compared at the validation points. Figure 11 shows the correlations between the two statistical descriptors for both variables (R² is equal to 0.73 and 0.77 for γ 1 and g 2 , respectively); therefore, the suitability of selecting dη/dt to classify ocean tides in the world's coastal areas is verified.
The above analysis shows that a very good tidal characterization dominates the global map, even in the most problematic areas. Hence, it can be concluded that the developed tidal classification represents the astronomical tide conditions throughout the world's coastal areas with high reliability.

Discussion
It has been effectively demonstrated that tidal asymmetry evolves over relatively short distances in coastal, estuarine, and river environments and influences the behaviors of these areas (Hoitink et al., 2006). The characterization of tidal asymmetry in open oceans is highly relevant to the study of estuarine processes since tidal asymmetry constitutes the boundary condition that influences the propagation of tidal waves throughout an estuary and, consequently, the transport processes that occur within the estuary, as well as estuarine-ocean exchanges (Ranasinghe & Pattiaratchi, 2000). In this study, the most recent version of the TPXO Global Tidal Solutions (TPXO9-atlas), statistical methods (PDFs), and clustering and harmonic approaches were used to effectively describe the tidal asymmetry in coastal areas before the propagation of tides into estuaries. Thereby, the results of this study facilitate research from complementary statistical and harmonic perspectives (Guo et al., 2019) and simplify the study of the evolution of tidal asymmetry when a tide propagates through any estuary in the world by analyzing only the responsible constituents.
The high quality of the information provided by the global classification of tides was demonstrated by the excellent agreement obtained in 97% of the nodes in which the PDFs were compared (Figure 9) and in 87% of the nodes in which the tidal regimes were checked (Figure 10), which also confirms the accuracy of the new TPXO9-atlas solution in characterizing coastal tides. The common denominator of the few nodes in which the distributions are not well characterized is their location, that is, areas exhibiting a complex geometry (areas too narrow or having significant depth changes, such as the interiors of estuaries) that even the local models of TPXO9-atlas do not describe with adequate precision. These features have significant local effects on tidal conditions, since they can induce overtides and compound tides that influence the resulting asymmetry and cannot be adequately reproduced by the TPXO tidal models, or they can generate areas where different tidal typologies converge. Therefore, to characterize the tide anywhere in the world, it is recommended to use nodes that are far enough from these complex areas and, if necessary, to propagate the tide from well-characterized areas.
As shown in section 3 (Results), symmetric tides cover 77.4% of the world's coastal areas, and most of these tides correspond to the mixed semidiurnal regime (60%), followed by the semidiurnal regime (30%) (Figure 6). The tidal asymmetries in the estuaries located in these areas depend exclusively on overtides and compound tides generated during inland propagation without being conditioned by external tidal features. On the other hand, although asymmetric tides constitute a smaller percentage, identifying them correctly is essential for describing any estuarine process dependent on tidal hydrodynamics, since accounting for an asymmetric tide can lead to very different consequences from the predictions derived from an assumption of tidal symmetry. The most asymmetric tides, both positive and negative, are concentrated in relatively adjacent areas of the world and mainly correspond to diurnal and mixed diurnal regimes (Figure 6). The reason for these maximum tidal asymmetries in the open ocean is fundamentally the interactions among the main tidal constituents. The astronomical triplet O1/K1/M2 dominates in the D, MD, and MS regimes, followed by the P1/K1/S2 and M2/M4 combinations, whereas the main contribution to the S regime comes from the M2/M4 constituents (Figure 7a); these results are consistent with the findings of Song et al. (2011).
Positive asymmetries characterize 11.3% of the world's coastal areas and influence the behaviors of estuaries dominated by tides that exhibit flood dominance at their mouths. A good example of this type of environment is the Dee estuary (United Kingdom), which shows an overall flood dominance that is likely induced by positively asymmetric ocean tides, which explains the known historical changes in the estuarine morphology (large-scale accretion over the last 2 centuries) (Moore et al., 2009). Negative asymmetries are present in the same percentage as positive asymmetries (11.3%) and can influence estuaries with external ebb dominance. Ranasinghe and Pattiaratchi (2000) studied three West Australian inlets whose mouths are dominated by ocean tides with long-term negative asymmetry. The authors showed that, despite the alternation of ebb- and flood-dominated periods, over a period of 1 year (to investigate the long-term effect), the net transport of sediment through the entrances of these systems was directed towards the sea. Nidzieko (2010) investigated three California estuaries with negative tidal asymmetries imposed at their mouths and demonstrated that such asymmetries increase or decrease (according to the estuarine morphology) as the tide propagates landward and therefore affect several internal processes. Similarly, Ferrarin et al. (2015) and Finotello et al. (2019) pointed out that the inlet region and the main channels of the Venice lagoon were dominated by the negative tidal asymmetry of the North Adriatic Sea. Ferrarin et al. (2015) obtained the annual value of γ₁ in the period 1976-2014 with the data of a tide gauge station located 15 km offshore from the Venice lagoon inlets and verified that γ₁ oscillated around an average value of −0.09, coinciding with the results of the classification carried out in this study. As these authors concluded, although tidal asymmetry develops according to the estuarine morphology as the tide propagates, the asymmetry imposed at the mouth of the estuary must first be overcome.
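The way an astronomical triplet such as O1/K1/M2 produces a persistent asymmetry can be checked numerically. Because the angular speeds of O1 and K1 sum to that of M2, the time-averaged cube of dη/dt keeps a constant term that depends on the phase combination φ_O1 + φ_K1 − φ_M2. The closed form in the sketch follows from averaging the triple product of sinusoids; it was derived for this illustration and is not quoted from the paper. The amplitudes, phases, and one-year hourly record are hypothetical.

```python
import math

# Angular speeds in rad/h; note that O1 + K1 = M2 (in deg/h: 13.9430356 + 15.0410686).
SPEED = {
    "O1": math.radians(13.9430356),
    "K1": math.radians(15.0410686),
    "M2": math.radians(28.9841042),
}


def triplet_skewness_analytic(amp, phase):
    """Skewness of d(eta)/dt for eta = sum a*cos(w*t - phi) over O1, K1, M2.

    Derived here: third moment = 1.5 * prod(a*w) * sin(phi_O1 + phi_K1 - phi_M2),
    variance = 0.5 * sum((a*w)^2).
    """
    b = {k: amp[k] * SPEED[k] for k in SPEED}
    delta = phase["O1"] + phase["K1"] - phase["M2"]
    third_moment = 1.5 * b["O1"] * b["K1"] * b["M2"] * math.sin(delta)
    variance = 0.5 * sum(v ** 2 for v in b.values())
    return third_moment / variance ** 1.5


def triplet_skewness_numeric(amp, phase, hours=8766):
    """Same quantity estimated from an hourly synthetic series of d(eta)/dt."""
    rates = [
        sum(-amp[k] * SPEED[k] * math.sin(SPEED[k] * t - phase[k]) for k in SPEED)
        for t in range(hours)
    ]
    mean = sum(rates) / hours
    m2 = sum((r - mean) ** 2 for r in rates) / hours
    m3 = sum((r - mean) ** 3 for r in rates) / hours
    return m3 / m2 ** 1.5


# Hypothetical amplitudes (m) and phases (rad) giving a phase combination of +90 degrees.
amp = {"O1": 0.3, "K1": 0.4, "M2": 0.5}
phase_pos = {"O1": 0.0, "K1": 0.0, "M2": -math.pi / 2}
phase_neg = {"O1": 0.0, "K1": 0.0, "M2": math.pi / 2}

gamma1_analytic = triplet_skewness_analytic(amp, phase_pos)
gamma1_numeric = triplet_skewness_numeric(amp, phase_pos)
gamma1_reversed = triplet_skewness_analytic(amp, phase_neg)
print(f"analytic = {gamma1_analytic:.3f}, numeric = {gamma1_numeric:.3f}, "
      f"reversed phases = {gamma1_reversed:.3f}")
```

With identical amplitudes, shifting the M2 phase by 180° reverses the sign of γ₁, so the orientation of the asymmetry is set by the relative phases of the constituents rather than by their amplitudes alone.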

Conclusions
This research addresses the study of tidal asymmetry on a global scale, based on a statistical approach that is adequate when asymmetry results from the interaction among multiple tidal constituents. The astronomical tide has been classified by applying a K-means clustering technique to the TPXO9-atlas solution, a complete (15 tidal constituents) barotropic tidal solution with a high spatial resolution (1/30°) to describe the tidal conditions in the vicinity of coastal areas. Accordingly, this study increases the knowledge available in the current state of the art about the long-term tidal distribution on a global scale.
Twenty-five astronomical tide types were identified (through PDFs of the tidal elevation time derivatives), and the main tidal constituents associated with each tide type were estimated. The classification was validated by comparing the outputs with the data from 757 tide gauges strategically distributed to cover the world's coastal areas. The results showed that both the open ocean and the coastal areas are dominated by symmetric tides (85.3% and 77.4%, respectively); negative asymmetries characterize 8.2% of open oceans and 11.3% of coastal areas; and positive asymmetries cover 6.5% of open oceans and 11.3% of coastal areas.
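The clustering step summarized above can be sketched with a plain Lloyd-type K-means applied to binned PDFs. This is a generic illustration, not the authors' implementation: the six 5-bin PDFs are synthetic, k = 3 is used instead of 25, and the deterministic farthest-point initialization is a choice made here for reproducibility.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equally binned PDFs."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iterations=50):
    """Lloyd-type K-means with deterministic farthest-point initialization."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        centroids.append(list(farthest))
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each PDF goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # converged: assignments no longer change
        labels = new_labels
        # Update step: each centroid becomes the mean PDF of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Synthetic PDFs: two skewed one way, two symmetric, two skewed the other way.
pdfs = [
    [0.05, 0.10, 0.20, 0.40, 0.25],
    [0.04, 0.11, 0.20, 0.38, 0.27],
    [0.10, 0.20, 0.40, 0.20, 0.10],
    [0.11, 0.19, 0.40, 0.21, 0.09],
    [0.25, 0.40, 0.20, 0.10, 0.05],
    [0.27, 0.38, 0.20, 0.11, 0.04],
]

labels, centroids = kmeans(pdfs, k=3)
print(labels)
```

Each centroid is itself a PDF, which is what allows a cluster to be read as a representative tide type and compared with gauge-derived PDFs.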
In general, the astronomical triplet O1/K1/M2 controls the tidal asymmetries in the diurnal, mixed diurnal, and mixed semidiurnal regimes, whereas the M2/M4 pair of constituents is the combination that stands out for most semidiurnal regimes.
The developed classification represents a guiding framework regarding the types of astronomical tide according to their asymmetry (orientation and degree) and periodicity on a worldwide scale. This framework can be used as a reference for further research on the transport of substances (e.g., sediments, nutrients, plastics, oil spills, and chemicals) in estuaries, as the asymmetry imposed at the mouth of an estuary may influence most internal transport processes.

Figure 5. Representativity of the number of clusters (N): (a) scatter plots of γ₁-g₂ associated with different TR from TPXO9-atlas (color dots) and from the clustering (dots inside the black circles) for different N and (b) efficiency coefficient (CE) of γ₁/g₂/TR.

Figure 7. (a) Contributions (βᵢ) of the combinations of tidal constituents to the skewness of the ATtypes (Cᵢ) (the greatest contributions that explain the γ₁-g₂ pair are framed in gray); (b) scatter plot of γ₁ (from clustering)-γ₁ (resulting from the main tidal constituents that explain the PDF shape); and (c) scatter plot of g₂ (from clustering)-g₂ (resulting from the main tidal constituents that explain the PDF shape). Each Cᵢ on the y-axis is represented by a color associated with a tidal regime, namely, blue for S, green for MS, yellow for MD, and red for D.

Figure 9. Efficiency coefficient (CE) between the PDFs from clustering and the PDFs from the tide gauge records worldwide (a) and in the following areas: the Adriatic Sea (b), the central East Coast of the USA (c), the coastlines of Ireland and the United Kingdom (d), the northern Brazilian coast (e), and the Japan coast (f). Some examples of PDFs that show different levels of agreement in these areas are labeled cᵢ, dᵢ, eᵢ, and fᵢ.

Figure 10. Matched/mismatched indicators between the tidal regimes (TR) from clustering and the tide gauge records worldwide (a) and in the following areas: the Gulf of Mexico and southeast coast of the United States (a1) and the Sea of Japan and East China Sea (a2); (b) scatter plot between the tidal form factor (F) from tide gauges (GESLA-2) and from TPXO9-atlas (mismatched points are highlighted in red).

Figure 11. Relationships between the statistical parameters γ₁ and g₂ for dη/dt and u (tidal currents in the average direction of tidal propagation).

Journal of Geophysical Research: Oceans NÚÑEZ ET AL.
results indicate that symmetric tides (|γ₁| < 0.1; from C9 to C19) dominate 85.3% of the world's open ocean, as well as 77.4% of all coastal areas. Indeed, symmetric tides correspond mainly to the mixed semidiurnal regime (approximately 60% of symmetric tides for the open ocean and coastal areas) and semidiurnal regime (approximately 36% and 30% of symmetric tides for the open ocean and coastal areas, respectively). A significant part of the Atlantic Ocean, including the East Coast of the United States, the western coast of Spain, and large proportions of the Brazilian and African coasts, exhibits these ATtypes (light-colored areas in Figure 6b). Conversely, 14 ATtypes show a clear asymmetric component (|γ₁| > 0.1), where the greatest asymmetries (|γ₁| > 0.5) occur in the diurnal and mixed diurnal regimes (i.e., C1, C2, C24, and C25). On the one hand, positive asymmetries (from C1 to C8) are found in 6.5% of the open ocean, and they make up 11.3% of the coastal areas throughout the world. The western Gulf of Mexico, the northern Caribbean Sea, the Mediterranean coast of France, part of the Spanish coast, and the south coast of Australia are good examples of these ATtypes (warm-colored areas in Figure