On the use of Hidden Markov Processes and auto-regressive filters to incorporate indoor bursty wireless channels into network simulation platforms

In this paper we thoroughly analyze two alternatives to replicate the bursty behavior that characterizes real indoor wireless channels within network simulation platforms. First, we study the performance of an improved Hidden Markov Process model, based on a time-wise configuration so as to decouple its operation from any particular traffic pattern. We compare it with the behavior of the Bursty Error Model Based on an Auto-Regressive Filter, a previous proposal of ours that emulates the received Signal to Noise Ratio by means of an auto-regressive filter that captures the “memory” observed in real measurements. We also study the performance of one of the legacy approaches natively offered by most network simulation frameworks. By means of a thorough simulation campaign, we demonstrate that our two models offer a much more realistic behavior while keeping the computational complexity affordable.


Introduction
Wireless technologies are constantly evolving and have become an essential part of everyday life. In particular, the birth and rise of IEEE 802.11-compliant technologies has led to a remarkable increase in the popularity of wireless local area networks. As a consequence, the research community needs to address the multiple challenges posed by this particular type of communications. Although empirical experimentation using off-the-shelf technologies seems to be the natural way to analyze the performance of such technologies, it also has clear limitations (for instance, scalability and reproducibility). These drawbacks, amongst others, bring about the need for simulation methodologies.
On the other hand, one of the strongest arguments against simulation techniques is the low accuracy of the most widespread propagation models, whose operation is usually "simplified". Although there exist advanced (and complex) approaches that could provide a higher level of accuracy (for instance ray modeling or electromagnetic theory), they require a rather long simulation time. This prevents their use within network simulation platforms, which are focused on the upper layer protocols, algorithms and mechanisms. Hence, the main reason behind the abstraction of the physical layer complexity in a network simulator is to ensure the scalability of the scenarios to be deployed, in which the number of elements might grow considerably. It is therefore necessary to provide physical-level mechanisms able to reflect a behavior close to the one exhibited by real channels in a reasonable amount of time, showing a good trade-off between accuracy and simulation complexity.
Although we can find IEEE 802.11 networks almost everywhere (even in open areas), indoor environments are the most prevalent scenario for this type of technology. Many works have assessed the performance in indoor environments, and one of their most remarkable findings is the harshness of the wireless channel, due to the presence of walls, furniture, people moving, etc. Several of these studies highlight that there is a clear memory effect within these particular scenarios, since consecutive frame error events are not independent and tend to happen in bursts.
Besides, the ever-increasing computing capacity of devices has led to the development of advanced techniques to replicate the behavior of real networks. As a consequence, novel and more powerful simulators, like ns-3 [1], the natural successor to the popular ns-2, have become available. However, their mainstream propagation models are still far from being realistic, keeping the same drawbacks as those of their predecessors.
In this work we aim at mimicking the behavior of a real indoor channel, which was thoroughly studied by means of an empirical campaign. The results obtained from this analysis will be used to tune two different proposals to emulate a wireless channel, whose operation is completely dissimilar. The former relies on a time-based configuration of a Hidden Markov Process (HMP), where the transitions between the states in the Markov chain are detached from the discrete events generated by the reception of a frame in a particular node (to the best of our knowledge, no previous work has used the same approach to configure an HMP-based model). The second one, Bursty Error Model Based on an Auto-Regressive Filter (BEAR), employs an Auto-Regressive (AR) filter to estimate the received signal strength and to reflect the memory effect observed over real channels. Both models are able to replicate the bursty behavior exhibited over real indoor scenarios. We also compare them with one of the legacy ns-3 alternatives that, although it provides accurate average error ratios and throughput values, is not able to replicate the memory effect that was observed over a real scenario. We also evaluate the impact of these models on TCP, studying scenarios in which its performance is seriously jeopardized by the wireless channel. Last, we compare the computational complexity of the three approaches since, as mentioned earlier, there must be a trade-off between accuracy and complexity, especially when the number of wireless links is large.
The remainder of this paper is structured as follows: Sects. 2 and 3 describe the two models that mimic the bursty behavior exhibited by real indoor wireless channels (Hidden Markov Process (HMP) and BEAR, respectively). Section 4 introduces an alternative that is originally provided by the ns-3 simulator. Section 5 outlines the simulation campaign carried out to compare the performance of the three different solutions, whose main findings are also discussed. Afterwards, Sect. 6 positions this work with respect to other contributions that have tackled the modeling of indoor wireless channels. Finally, Sect. 7 concludes the paper, advocating some issues to be addressed in our future work.

Channel model based on a Hidden Markov Process
The use of HMP techniques to mimic real processes has gained popularity since their first application in the 1960s. Countless research applications have bloomed since then: we can find HMPs within speech recognition applications, to predict the location of people based on their habits, or even in novel bio-informatics studies, such as the analysis of bio-segments (for example gene prediction or protein folding).
We can define an HMP as a discrete system with $N$ independent states ($S_i$, where $i$ is the index of each of the states). The transitions between them follow a set of stochastic probabilities, which are referred to as transition probabilities and are represented as $a_{i,j}$, the probability of moving from $i$ (current state) to $j$. Another set of probabilities is used to characterize the decisions within the states, mapping them to the possible output values of the system (observables); they are defined as $b_i(k)$, where $i$ refers to the particular state and $k$ establishes the corresponding output symbol. It is worth mentioning that in an HMP, unlike legacy Markov models, the states are "hidden", since each of them does not yield one single output value, but there are various possibilities (each of them with a probability given by $b_i(k)$). Last, we also need to establish the initial state of the system; for that purpose, the vector $\Pi = \{\pi_i\}$ defines the probability of being at the $i$th state when the system gets started.
Taking into account how the model is implemented, we are able to define a complete HMP channel by means of the following elements.

5. The initial probability distribution of being at each state, $\Pi = \{\pi_i\}$. For the sake of simplicity, we will assume that $\pi_i = 1/N$, and therefore the initial state will be randomly selected.
In order to configure this model, we used real traces obtained over a real indoor channel. The experiment consisted of the exchange of a 10 MB file between two nodes (i.e. server-client) over an IEEE 802.11 ad-hoc cell, through a direct single-hop link. The location of the stations did not change throughout the experiments, with a separation of ≈ 15 m between them, without line of sight, and with both metallic obstacles and people moving within the scenario (typical office environment). Both of them used WaveLAN 11 Mbps Lucent/Orinoco PCMCIA cards, configured in a proprietary Ad Hoc (pseudo-IBSS) mode which did not use management frames; we fixed the data rate at its maximum of 11 Mbps during the whole assessment. Additionally, the corresponding wireless card driver was modified so as to track, for each incoming frame, whether it was corrupted (CRC failed) as well as the received Signal to Noise Ratio (SNR). The maximum number of transmissions for an IEEE 802.11 frame was fixed to four and the RTS/CTS mechanism was disabled during the experiments. Last, but not least, we ensured that the presence of IEEE 802.11 traffic from other networks was negligible during the whole campaign. In the UDP case, we sent 10,000 UDP/IP unicast datagrams, with 1,472 bytes of payload, saturating the wireless link; to generate TCP traffic we used FTP to transfer a file of 10 MB. TCP Reno was used, with the Selective Acknowledgment and Timestamp options enabled; the Maximum Segment Size was 1,448 bytes, to maintain the same frame length that was used in the UDP case. The reader might refer to [3] for a more thorough description of the overall process.
For each of the 15 experiments (whose complete statistics are gathered in Table 1 in [3]), we generated a trace file containing, for every received frame, a 0 if it was corrupted or a 1 if it was correctly received. Afterwards, with the resulting binary vector, as Fig. 1a shows, the corresponding Markov chain is "trained" using Matlab's hmmtrain function, establishing (as an additional constraint) that the resulting chain shall be a birth-and-death process. As can be seen, the chain itself is "hidden", whilst the "discrete observations" (reception events) are the main input arguments. This function returns both the transition and the decision matrices, which will be afterwards used by the ns-3 environment.
On the other hand, the simulator-driven operation, depicted in Fig. 1b, works differently. In this case, the "visible" part of the process is the Markov chain itself, and its transition probabilities define the behavior of the model. Furthermore, the "hidden part" of the process corresponds to the frame reception decisions: error (0) or success (1). In other words, during the execution flow, the channel will be changing its current state according to the underlying transition probabilities ($a_{i,j}$) and, when a node receives a frame, the corresponding decision probability will establish whether the frame was correct or not, by comparing a random value with the corresponding $b_i(0)$ coefficient.
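The simulator-driven operation described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the ns-3 implementation: the class name `HmpChannel`, its methods, and the values in `A` and `B0` are hypothetical placeholders rather than trained matrices.

```python
import random

# Hypothetical 4-state birth-and-death chain (illustrative values only).
A = [[0.90, 0.10, 0.00, 0.00],
     [0.05, 0.90, 0.05, 0.00],
     [0.00, 0.05, 0.90, 0.05],
     [0.00, 0.00, 0.10, 0.90]]
B0 = [0.01, 0.10, 0.50, 0.95]   # b_i(0): frame error probability per state


class HmpChannel:
    """Markov chain with a per-state probability of corrupting a frame."""

    def __init__(self, transition, error_prob, rng=None):
        self.transition = transition
        self.error_prob = error_prob
        self.rng = rng or random.Random()
        # Uniform initial distribution: every state has probability 1/N.
        self.state = self.rng.randrange(len(transition))

    def step(self):
        """Draw the next state from row `state` of the transition matrix."""
        u = self.rng.random()
        cumulative = 0.0
        for next_state, prob in enumerate(self.transition[self.state]):
            cumulative += prob
            if u < cumulative:
                self.state = next_state
                break

    def decide(self):
        """Return 0 (corrupted) or 1 (correct) for a frame received now."""
        return 0 if self.rng.random() < self.error_prob[self.state] else 1
```

Keeping `step()` and `decide()` separate leaves the caller free to trigger state changes either at every reception or from an independent timer.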
Cardoso et al. [5] also used a set of real traces to configure the different parameters of the HMP. However, there is a rather relevant difference between their approach and ours: in our case the measurements were done by saturating the wireless channel, ensuring that there were always frames to be sent at the transmitter, whereas the authors of [5] fixed an interval of 10 ms between consecutive frames. With the corresponding traces, the model presented in this paper was configured by changing the duration of bursts from frames to time. In this sense, we are not bound to use any particular time between consecutive transmissions at the source node, and the behavior of the channel model is orthogonal to the traffic characteristics; this brings about the possibility of using it with different types of applications (including TCP-based ones, in which the time between consecutive segments heavily depends on the dynamics of the corresponding congestion algorithms).
In order to complete this time-based model configuration, we need to estimate the pdf of the time spent at a particular state $i$, which we approximate by a negative exponential distribution with rate $\lambda_i$, so that $T_i = 1/\lambda_i$ is the average time spent at state $i$. $T_i$ can be calculated using the average number of consecutive frames at each of the states, $F_i$, which can be derived using Eq. (1), where $p_i(j)$ is the probability of having $j$ consecutive frames at state $i$.

$$F_i = \sum_{j=1}^{\infty} j \, p_i(j) \qquad (1)$$

Eq. (2) can be used to determine the value of $T_i$, the average sojourn time at state $i$, where $w$ denotes the average inter-frame duration, if it is assumed to be constant.

$$T_i = F_i \cdot w \qquad (2)$$
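As a rough numerical illustration of the relation between consecutive frames and sojourn time, the sketch below assumes that the number of consecutive frames at a state is geometrically distributed with the self-transition probability $a_{i,i}$ (an assumption of this example; the distribution could equally be taken from the traces). The function names and the sample values are hypothetical.

```python
import random


def mean_frames_in_state(a_ii, max_j=10_000):
    """Average number of consecutive frames, sum_j j * p_i(j).

    Assumes p_i(j) = a_ii**(j-1) * (1 - a_ii), i.e. the state is left
    with probability 1 - a_ii after every frame.
    """
    return sum(j * a_ii ** (j - 1) * (1.0 - a_ii) for j in range(1, max_j + 1))


def sojourn_time_from_frames(a_ii, w):
    """Average sojourn time: mean number of frames times inter-frame time w."""
    return mean_frames_in_state(a_ii) * w


def draw_sojourn_time(mean_time, rng):
    """Sample one sojourn time from a negative exponential with that mean."""
    return rng.expovariate(1.0 / mean_time)
```

For instance, with $a_{i,i} = 0.9$ and $w = 2$ ms the state lasts ten frames on average, that is, about 20 ms.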
From all the results obtained over the real-scenario testbed [3], we have trained the corresponding HMP configurations with three illustrative behaviors (selected from the 15 measurements), ranging from a Bad channel, characterized by rather negative transmission conditions, to a Good channel, whose operation gets closer to that which we could expect over an error-free link; an Average channel, representing an average behavior of the channel, was also selected. Table 1 summarizes the main performance values for each of these measurements. In particular, we rely on some statistical performance parameters, which are defined below:
• Throughput. Overall performance, calculated as the received information divided by the transmission time. This metric reflects the useful data rate measured at the receiver's application layer.
Fig. 1 Graphical interpretation of an HMP (birth-and-death process). a HMP general overview, b HMP simulator's insight

a particular time interval, causing the loss of several consecutive frames (we refer to this event as an error burst).

Dynamic time-basis analysis
As mentioned earlier, the authors of [5] used a frame-based HMP able to mimic the behavior of a wireless channel for a very particular traffic pattern. In this sense, if the data rate generated by the source node was different, the behavior of the channel model would not be appropriate. The corresponding chain would not accurately mimic the real performance, since there is a tight relationship between such behavior and the configuration of the underlying model.
Since we aim at a more generic solution, we modeled the average transmission time per frame. As a first approach, we simply assumed that the time between two consecutive frames was constant for all states, $w$ in Eq. (2), no matter the channel quality or the erroneous frame bursts. In order to have a more accurate solution, we modeled the average time between two consecutive frames depending on the current state and the corresponding number of retransmissions. For that purpose, we need to calculate how long it takes (on average) for a frame to be delivered to a receiver node, $D_k$, considering $k$ retransmission attempts, as shown in Eq. (3). In that expression, the first term corresponds to the deterministic contribution of the IEEE 802.11 DCF scheme, while the second term models the average value of the random time caused by the CSMA/CA procedure, which doubles the contention window for every retransmission (binary exponential backoff procedure). In particular, the following parameters are used:
• $k$. This value indicates the number of retransmissions that were sent for a particular datagram. A value of $k = 0$ indicates that a frame was correctly received at the first transmission. Since the maximum number of retransmissions was 3, $k \le 3$. Besides, $D_3$ does not necessarily imply that the frame was correctly received after the third retransmission, although the overall time (for the four attempts) would be alike.
• $d_c$. It is defined as the fixed time per frame, which accounts for the deterministic contributions: Distributed Inter Frame Space (DIFS), transmission time of the data frame, together with the physical header and preamble, Short Inter Frame Space (SIFS) and transmission time of the IEEE 802.11 ACK, including the physical header and preamble. For the particular configuration that was used during the measurement campaign, the value of $d_c$ is ≈ 1.7 ms.
• $\sigma$. This parameter reflects the slot time of the contention window used by the IEEE 802.11 DCF mechanism.
Each of the states is characterized by the probability for a frame to be correct ($1 - p_i$) or erroneous ($p_i$), where $i$ denotes the current state of the Markov chain and $p_i = b_i(0)$.
From these two values we can easily derive the probability that a frame requires $k$ retransmissions. Hence, the average time per frame that shall be used to translate the state duration to time units can be derived as shown in Eq. (4), where $R$ is the maximum number of retransmission attempts per frame, which was set to 3 in our case.
Finally, and considering that the traces that were used to train the corresponding chains were obtained under saturation conditions, we can model the average sojourn time per state $T_i$ as shown in Eq. (5).
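The chain of calculations in this subsection (delivery time for $k$ retransmissions, average time per frame at a state, and sojourn time) can be illustrated with the following sketch. It is one plausible reading under explicit assumptions, not a transcription of Eqs. (3) to (5): frame errors within a state are taken as independent, the mean backoff of each attempt is half the contention window, and the slot time and window bounds are assumed IEEE 802.11b values. All function names are hypothetical.

```python
D_C = 1.7e-3                 # fixed time per attempt (s), value quoted in the text
SLOT = 20e-6                 # assumed 802.11b slot time (s)
CW_MIN, CW_MAX = 31, 1023    # assumed 802.11b contention window bounds
R = 3                        # maximum number of retransmissions


def delivery_time(k):
    """Mean time to handle a frame that needs k retransmissions."""
    # The window doubles at every retransmission (binary exponential backoff);
    # the mean backoff of an attempt is half its window, in slots.
    backoff_slots = sum(min((CW_MIN + 1) * 2 ** j - 1, CW_MAX) / 2.0
                        for j in range(k + 1))
    return (k + 1) * D_C + SLOT * backoff_slots


def retx_probability(k, p):
    """Probability that a frame needs exactly k retransmissions (error prob. p)."""
    if k < R:
        return p ** k * (1.0 - p)
    return p ** R            # last attempt, whatever its outcome


def mean_time_per_frame(p):
    """Average time per frame at a state whose frame error probability is p."""
    return sum(retx_probability(k, p) * delivery_time(k) for k in range(R + 1))


def mean_sojourn_time(frames_in_state, p):
    """Sojourn time: average consecutive frames times average time per frame."""
    return frames_in_state * mean_time_per_frame(p)
```

Under these assumptions an error-free state yields about 2 ms per frame, in line with the inter-frame time of a saturated IEEE 802.11b link.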

Time-based versus frame-based modeling
One of the most relevant aspects of the configuration proposed herein lies in the fact that it has been done by means of a time-based characterization. As was said before, most of the existing works use the traditional frame-based approach, which is sensible only when the simulation conditions are exactly the same as the ones that characterized the real traces used to train the HMP model (e.g. the 10 ms between consecutive transmissions in [5]); otherwise, the model would not be valid. On the other hand, if the configuration followed a time-based operation, the dependency on the traffic pattern would no longer be a problem.
In order to verify that a frame-based operation is not able to correctly capture a change of the traffic pattern, we carried out a complementary analysis. With the Bad channel trace, we configured the HMP using both the frame-based and the time-based configurations. Then, we reduced the application data rate (in the simulator) to 600 kbps, i.e. without saturating the channel. The average delay between two consecutive receptions would then be ≈ 20 ms, as opposed to the ≈ 2 ms that characterizes a saturated IEEE 802.11b transmission. In order to assess the goodness of the results, we synthetically created a trace by decimating the one corresponding to the Bad channel instance; in this sense we extracted one out of every ten frames, roughly corresponding to an interval of 20 ms between consecutive frames. Furthermore, as this synthetic trace (only one out of every ten frames) does not include any 802.11 retransmissions, we disabled the corresponding scheme in the simulator, so as to enable a fair comparison. Table 2 shows the behavior exhibited by the two possible configurations and the statistics of the synthetic trace. As can be seen, the frame-based approach resembles the FER quite well, but keeps the memory behavior of the trace it was originally trained with, and therefore the bursts are much longer. On the other hand, the time-based model also mimics the EFB statistics quite appropriately, showing the greater flexibility of this approach.
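The decimation and the statistics used in this comparison are simple to reproduce on any binary trace (0 for a corrupted frame, 1 for a correct one). The helper names below are hypothetical and the sample trace is made up for illustration.

```python
def frame_error_rate(trace):
    """Fraction of corrupted frames (zeros) in a binary trace."""
    return trace.count(0) / len(trace)


def error_burst_lengths(trace):
    """Lengths of the runs of consecutive corrupted frames."""
    bursts, run = [], 0
    for bit in trace:
        if bit == 0:
            run += 1
        elif run:
            bursts.append(run)
            run = 0
    if run:                      # trace ends inside a burst
        bursts.append(run)
    return bursts


def decimate(trace, factor=10):
    """Keep one out of every `factor` frames, starting with the first one."""
    return trace[::factor]


sample = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]
print(frame_error_rate(sample), error_burst_lengths(sample))
```

Applying `error_burst_lengths` to both the original and the decimated trace makes the loss of memory caused by the longer inter-frame time directly visible.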
As an illustrative example, we represent in Fig. 2 the temporal evolution of the HMP state transitions, as well as the frame reception events (plotted as arrows). We can observe that the time-based mode (Fig. 2b) keeps the state change rate over time, independently of the traffic pattern; transition events are decoupled from the reception of frames and there are cases in which the channel visits and leaves a state within the interval between two consecutive receptions. Hence, the memory effect that was seen over a saturated channel is reduced (as was observed in Table 2), since the reception event of an arbitrary frame might be independent of the previous ones (the "bursty effect" disappears). On the other hand, the frame-based operation is tightly coupled to these physical receptions; Fig. 2a shows that, even if the average time between transmissions was modified, the average time per state would be scaled likewise, since transitions are triggered by reception events and the "mean number of frame receptions the channel will remain at the $i$th state" can be expressed as shown in Eq. (1), independent of the time. In other words, a frame-based HMP channel model is tightly linked to the particular conditions that were used to train the model and to obtain the transition and emission matrices, so this approach will always yield an output similar to the trace file used to "train" it.

Channel model based on an auto-regressive filter
Another approach to replicate the behavior of indoor wireless channels is to estimate the signal power at the receiver node. The BEAR model, originally proposed by Agüero et al. [3], follows this concept. In this work, in order to compare its performance with the one shown by the HMP models discussed in the previous section, we have ported its implementation to the ns-3 network simulator. Figure 3 depicts the BEAR operation, from the physical transmission of a packet to the point at which the receiver entity decides whether that particular frame is correct or not.
The cornerstone of BEAR consists in estimating the received link quality by considering three different contributions. We summarize their main features below:
• The first one depends on the distance between the transmitter and the receiver nodes; it is normally characterized by a factor $d^{-m}$, where $d$ represents the separation between the two nodes and $m$ is tuned according to the propagation loss model (we could refer to this parameter as the "pathloss exponent"). Within this work we use a simple log distance propagation loss model, which is originally provided by the simulator.
• The second component reflects the slow variations of the received signal (Slow Variation, SV), which could be ascribed to the presence of physical obstacles within the path. In order to mimic such effect, BEAR uses an auto-regressive filter as shown in Eq. (6). The corresponding coefficients, $a[i]$, were tuned from the results obtained during the empirical campaign, using the Yule-Walker algorithm. As can be seen, the next value of the SV contribution, $SV[i]$, is "predicted" from the previous stored samples, $SV[i - j]$, limited by the AR filter order, $T$; $a[j]$ corresponds to the filter coefficients. It is worth highlighting that each of the samples reflects a received frame (the time step would be ≈ 2 ms in the particular configuration we used); in order to decouple the channel model from the traffic pattern (i.e. without saturation conditions), we included a timer, whose expiration deletes the previously stored samples, so that they no longer impact the SNR of new frames. Finally, the remaining term is a white noise contribution with average power $P$. The reader might refer to [3] for a more thorough discussion of the operation of this model.
• The last contribution reflects the multi-path nature of the wireless channel, leading to fast signal variations. The literature refers to this phenomenon as Fast Variation (FV) or shadowing effect. In this work, it will be modeled as a random (i.e. Gaussian) variable with zero mean and a variance of $\sigma^2$ dB$^2$.
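The way the three contributions are combined can be sketched as follows. This is a hedged illustration rather than the BEAR code: the additive sign convention of the filter, the use of the square root of the noise power as standard deviation, the coefficient values and all names (`SlowVariation`, `fast_variation`, `snr_db`) are assumptions of this example; the actual coefficients come from the Yule-Walker fit in [3].

```python
import math
import random
from collections import deque


class SlowVariation:
    """AR filter of order T = len(coeffs) with a history timeout."""

    def __init__(self, coeffs, noise_power, timeout, rng=None):
        self.coeffs = list(coeffs)              # a[1], ..., a[T]
        self.noise_std = math.sqrt(noise_power)
        self.timeout = timeout                  # seconds without frames
        self.rng = rng or random.Random()
        self.history = deque(maxlen=len(self.coeffs))   # most recent first
        self.last_time = None

    def sample(self, now):
        """Return the SV value (dB) for a frame received at time `now`."""
        if self.last_time is not None and now - self.last_time > self.timeout:
            self.history.clear()                # stale samples are dropped
        self.last_time = now
        predicted = sum(a * s for a, s in zip(self.coeffs, self.history))
        value = predicted + self.rng.gauss(0.0, self.noise_std)
        self.history.appendleft(value)
        return value


def fast_variation(rng, variance_db2=2.8):
    """Zero-mean Gaussian FV term; 2.8 dB^2 is the variance used in the setup."""
    return rng.gauss(0.0, math.sqrt(variance_db2))


def snr_db(rss_dbm, noise_dbm, sv_db, fv_db):
    """SNR as the sum of the three contributions minus the noise floor."""
    return rss_dbm - noise_dbm + sv_db + fv_db
```

The bounded `deque` keeps only the last $T$ samples, and clearing it on timeout is what decouples the filter memory from the traffic pattern.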
The sum of all these contributions, which are also combined with an equivalent noise power to calculate the SNR, is depicted in Fig. 4: first, the deterministic propagation loss model returns the Received signal strength (RSS) as a function of the distance; besides, we can get the noise floor level from a legacy interference model, which is out of the scope of this work (Fig. 4a). The second contribution is the result of the AR filter, yielding a signal with slow variations over time (Fig. 4b). The last component reflects a typical shadowing effect, showing a completely random nature (Fig. 4c). As can be seen in Fig. 5, which shows the probability density function (pdf) of the received SNR, BEAR yields clearly different SNR distributions for correct and erroneous frames. The average SNR for the erroneous frames is around 3/4 dB lower than for the correct ones; this reflects what was observed over real channels [3].
Afterwards, the overall SNR (Fig. 4d) will be the input of a decision entity, responsible for establishing whether the frame is correct or not.
• If the RSS is higher than the energy reception threshold, the frame delivery to the upper layers relies on the operation of the new error model, silently passing through the original physical reception.
• Instead of using the Bit error rate (BER) curves supported by the simulator, we have incorporated a logistic function, Eq. (7), which determines the FER as a function of the SNR; this relationship, as well as all its parameters ($a$, $b$, $c$ and the two thresholds, LT and HT), was obtained using a curve fitting tool with the relationship that was empirically observed over the real channel.
• The previous expression is only valid for 1,500 byte frames (worst case), and different relationships should be found for different lengths. For instance, the model considers that all IEEE 802.11 ACKs are always correct, since the probability of losing them is much lower than the one seen for data frames.
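A decision entity of this kind can be illustrated with a thresholded logistic curve. The functional form and every default value below are hypothetical placeholders chosen for the example, since the fitted expression and its parameters are those reported in [3]; only the overall structure (two thresholds and a logistic transition in between) follows the description above.

```python
import math
import random


def frame_error_rate(snr_db, a=1.0, b=1.0, c=8.0, low=2.0, high=20.0):
    """Hypothetical FER curve: 1 below `low`, 0 above `high`, logistic between."""
    if snr_db <= low:
        return 1.0
    if snr_db >= high:
        return 0.0
    return a / (1.0 + math.exp(b * (snr_db - c)))


def frame_is_correct(snr_db, rng):
    """Compare a uniform random value with the FER to accept or drop a frame."""
    return rng.random() >= frame_error_rate(snr_db)
```

With these placeholder values the curve crosses FER = 0.5 at an SNR of 8 dB; a fitted curve would be obtained by adjusting `a`, `b`, `c` and the two thresholds to the measured FER versus SNR relationship.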
For a more thorough description of how the real traces are used to identify the AR filter coefficients and the logistic function parameters the reader can refer to [3].

Legacy model supported by the simulator
The last channel analyzed in this work corresponds to one of the mainstream IEEE 802.11 models originally supported by the ns-3 simulator. In this particular case, as shown in Fig. 6, we have configured the lower layer as follows. We have two propagation elements: the first one is based on a log distance propagation loss model, and returns a deterministic value as a function of the distance between the nodes. The attenuation factor $L$ is obtained as shown in Eq. (8), $L_0$ being the path loss reference (in dB), $m$ the path loss distance exponent, $d$ the distance between the source and sink nodes and $d_0$ the reference distance (in meters). Besides, the second contribution mimics a shadowing (FV) effect, which is modeled with a normal random process $N(0, \sigma^2)$.
As was done with BEAR, we derive the SNR of the received frame by adding the two aforementioned contributions and the equivalent noise power (provided by an interference model). This value will be used to find the corresponding FER, using BER curves that are chosen according to the bit rate and the modulation. Once the FER is obtained, it is compared with a random value to decide whether a frame is correct or not.
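The processing chain of this model can be sketched as below. The path loss follows the usual log-distance form, $L = L_0 + 10\,m\,\log_{10}(d/d_0)$; the default values of `l0`, `m` and `d0`, the assumption of independent bit errors when mapping BER to FER, and all function names are choices made for this illustration. The modulation-specific BER curves of the simulator are not reproduced here, so the BER is taken as an input.

```python
import math
import random


def path_loss_db(d, l0=46.6777, m=3.0, d0=1.0):
    """Log-distance path loss in dB (illustrative default parameters)."""
    return l0 + 10.0 * m * math.log10(d / d0)


def snr_db(tx_power_dbm, d, noise_dbm, shadowing_db):
    """SNR from deterministic loss, shadowing term and noise floor."""
    return tx_power_dbm - path_loss_db(d) + shadowing_db - noise_dbm


def fer_from_ber(ber, n_bits):
    """FER of an n_bits frame, assuming independent bit errors."""
    return 1.0 - (1.0 - ber) ** n_bits


def frame_is_correct(fer, rng):
    """Memoryless decision: each frame is judged independently."""
    return rng.random() >= fer
```

Since every frame is drawn independently, this chain has no mechanism to produce correlated errors beyond what the shadowing term provides.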
It is worth mentioning that the operation of this so-called Default model has some similarities with BEAR; the main difference between them is the use of the AR filter in the latter to emulate the slow variation of the channel. The overall behavior, in terms of the SNR, of the Default approach can therefore be seen as the sum of the deterministic propagation contribution (Fig. 4a) and the shadowing component (Fig. 4c). In addition, Fig. 7 shows that, unlike the BEAR case, there is not a relevant correlation

Simulation setup and results
After describing the different approaches that we have used to mimic the behavior of indoor wireless channels, we describe the setup of the simulation procedure we carried out to study the performance of the three alternatives. A source node sends 10,000 data packets, with an MTU of 1,500 bytes, to a receiver entity. We also assume that there is always traffic to be sent at the transmitter, thus saturating the wireless channel. Besides, the transmission power (txPowerDbm) was tweaked to reduce the corresponding coverage, using a value of 0 dBm. Below we describe the configuration for each of the different channel models.
• BEAR uses an order three auto-regressive filter ($T = 3$) with a white noise power $P = 5 \times 10^{-3}$ W/Hz; besides, the FV contribution will be modeled by means of a normal random variable $N(0, 2.8\ \mathrm{dB}^2)$.
• Regarding the HMP model, a 4-state hidden Markov chain is used, which was trained with the three traces corresponding to the measurements depicted in Table 1.
• Finally, the so-called Default model uses the same shadowing contribution as BEAR: $N(0, 2.8\ \mathrm{dB}^2)$. Furthermore, in order to compare its results to the ones observed with the HMP model, we have mimicked the same channel conditions (i.e. Good, Average and Bad), by changing the distance between the nodes so as to get a similar output in terms of the average FER.
We have carried out four phases: first, we characterize the behavior of the three different solutions using UDP traffic so as to reduce as much as possible the interplay of different upper layer mechanisms; the second one focuses on the addition of a distance-dependent functionality to the HMP model, tuned from the output obtained by BEAR in [6]; the third stage studies the impact of these channels on TCP, assessing its sensitivity to indoor wireless environments. Finally, we compare the three models in terms of their computational cost, studying the corresponding trade-off between computational time and accuracy.

Raw channel characterization
We use UDP because it is the most appropriate transport protocol to evaluate the "raw" behavior of the lower layers: the sending pattern is not altered by the wireless channel characteristics, which ensures that the wireless channel stands as the actual system bottleneck.
Figure 8 shows the most relevant statistics to understand the performance of the various models, namely FER, PER and throughput. We represent the cumulative distribution functions (cdfs) of the aforementioned parameters, after carrying out 500 independent simulation trials per configuration. First, the limited variability exhibited by the Default model can be easily seen (Fig. 8a, d, g), with an almost deterministic behavior. On the other hand, all the HMP configurations show a similar performance (see Fig. 8b, e, h), although the PER for the Bad configuration shows a higher variance, due to the impact of the error bursts, as will be discussed below. The third model, BEAR (Fig. 8c, f, i), covers, with just a single configuration, almost the whole range seen over the real testbed. The modification of BEAR's configuration parameters ($P$ for the SV and $N(0, \sigma^2)$ for the FV contribution) could yield an even higher variability.
Figure 9 shows the relationship between the FER and the PER for each of the channel models, to illustrate their memory factor. Each figure includes, in addition to the simulation-based results, the values observed over the real wireless channel [3], as well as a curve that represents the behavior of a memoryless channel. If there was no memory (i.e. the random process associated with a frame reception is independent of the previous ones), we could say that $PER = FER^{R+1}$, since a packet will get lost after the consecutive reception of $R + 1$ erroneous frames, $R$ being the total number of retransmission attempts set by the IEEE 802.11 entity (three in this work). In general, we can state that the Packet error rate (PER) could be calculated from the Frame error rate (FER) using Eq. (9), $PER = FER^{c}$, where $c$ gives an idea of the channel's memory impact (the lower $c$, the higher the memory). As can be seen, all simulations of the Default model lie within the memoryless behavior, $c \approx 4$. On the other hand, BEAR shows a higher variability, spanning (for a single configuration) the whole set of values that were observed during the real measurement campaign. However, there are some particular measurements that are not properly replicated by BEAR, since its $c$ parameter is slightly higher than the one of the real channel. Finally, the HMP offers, considering its three configurations altogether, a reliable modeling of the memory observed over a real scenario, although the variability is (for each of them) much lower than BEAR's. It can also be seen that the value of $c$ is, for all the HMP configurations, lower than the one seen for the BEAR channel.
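Assuming the relation $PER = FER^{c}$ discussed above, the memory exponent of any measurement or simulation run can be estimated directly from its pair of error rates. The function names and the sample values are hypothetical.

```python
import math


def memoryless_per(fer, retransmissions=3):
    """PER of a channel without memory: all R + 1 attempts must fail."""
    return fer ** (retransmissions + 1)


def memory_exponent(fer, per):
    """Exponent c such that per == fer ** c (requires 0 < fer, per < 1)."""
    return math.log(per) / math.log(fer)


print(memory_exponent(0.5, memoryless_per(0.5)))   # memoryless reference
print(memory_exponent(0.2, 0.04))                  # burstier channel, lower c
```

A run whose exponent falls clearly below $R + 1$ loses more packets than its FER alone would predict, which is the signature of error bursts.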
As was discussed before, the behavior of the different models in terms of FER and PER has a direct relationship with the "bursty" response of the channel. Figure 10 shows the EFB's pdf and complementary cumulative distribution function (ccdf) for the three models. It is worth highlighting that a burst longer than four frames would lead to a packet loss. We can again observe the poor bursty behavior offered by the Default model (Fig. 10a, d), where the vast majority of EFBs are shorter than ten frames; in fact, only the Bad configuration was able to replicate the appearance of bursts longer than five frames. As for the HMP model, its pdf (Fig. 10b) shows that bursts are much longer, with a non-negligible probability for EFBs > 10 frames, even for the Good configuration. On the other hand, BEAR (Fig. 10c, f) is able to cover (even with a single configuration) a broader range of behaviors, from short EFBs (≈ 85 % are shorter than four frames) to long ones (≈ 5 % are longer than ten frames).
On the other hand, it has been shown [3] that the probability of having an EFB of 100 or more frames over the real channel is actually lower than 0.7 %, and therefore we can conclude that both HMP and BEAR reflect this behavior with a reasonable level of accuracy. Although they yield bursts longer than 100 corrupted frames, their probability is very low: $\Pr\{\mathrm{EFB} > 100\}$ is on the order of $10^{-3}$ for both models.
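As an illustration of how the EFB statistics discussed above can be obtained, the following sketch extracts the erroneous frame bursts from a per-frame error trace and computes their empirical pdf and ccdf. The trace representation (one boolean per frame, `True` meaning an erroneous frame) and the function names are assumptions made for this example.

```python
from collections import Counter
from typing import Dict, List, Sequence


def burst_lengths(trace: Sequence[bool]) -> List[int]:
    """Lengths of the runs of consecutive erroneous frames (True = erroneous)."""
    bursts: List[int] = []
    run = 0
    for erroneous in trace:
        if erroneous:
            run += 1
        elif run:
            bursts.append(run)
            run = 0
    if run:  # trace ended in the middle of a burst
        bursts.append(run)
    return bursts


def efb_pdf(bursts: Sequence[int]) -> Dict[int, float]:
    """Empirical probability of each observed burst length."""
    counts = Counter(bursts)
    total = len(bursts)
    return {length: n / total for length, n in sorted(counts.items())}


def efb_ccdf(bursts: Sequence[int], threshold: int) -> float:
    """Empirical Pr{EFB > threshold}."""
    if not bursts:
        return 0.0
    return sum(1 for b in bursts if b > threshold) / len(bursts)
```

Evaluating `efb_ccdf(bursts, 100)` on a simulated trace gives the kind of tail probability that is compared with the real channel in this section.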

HMP distance-aware operation
Unlike the legacy wireless channel simulation models, one important shortcoming of the basic configuration of the HMP model is that it cannot, by itself, provide any dependency on the received signal strength. We have to choose a pair of transition and emission matrices (A and B) during the scenario setup and keep those settings throughout the simulation. However, we can improve the operation of this model by exploiting the results achieved by BEAR in [6], which are used to tune the performance as a function of the distance between nodes. We first ''discretize'' the response along the distance at a finite number of points. By using a distance-based study of the BEAR model, and the corresponding real behavior, we have chosen up to seven different measurements from Table 1 in [3] to configure the various HMP instances. Figure 11 shows the performance, in terms of FER, PER and throughput, as we vary the distance between transmitter and receiver. Since the distance thresholds were configured according to the FER, we can observe in Fig. 11a that all these values are reliably mimicked, as is the resulting throughput (Fig. 11c), which covers the whole range showcased by BEAR. On the other hand, although the PER shows an appropriate behavior, as illustrated in Fig. 11b, there are two small ''misalignments'': first, since HMP exhibits a stronger memory than BEAR (its $\gamma$ is lower), the PER is slightly higher, especially for low FER values; second, the current implementation is not able to cover PER values higher than 0.4.
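The distance-aware operation can be sketched as a simple lookup that binds each distance interval to one pre-trained HMP configuration. The class names, the threshold values and the configuration labels below are hypothetical; the actual thresholds were derived from the FER behavior of BEAR and are not reproduced here.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class HmpConfig:
    """Placeholder for one trained HMP instance (its A and B matrices)."""
    name: str


class DistanceAwareHmp:
    """Select an HMP configuration from the transmitter-receiver distance."""

    def __init__(self, thresholds_m: Sequence[float], configs: Sequence[HmpConfig]) -> None:
        if len(configs) != len(thresholds_m) + 1:
            raise ValueError("need exactly one configuration per distance interval")
        if list(thresholds_m) != sorted(thresholds_m):
            raise ValueError("thresholds must be sorted in increasing order")
        self._thresholds: List[float] = list(thresholds_m)
        self._configs: List[HmpConfig] = list(configs)

    def select(self, distance_m: float) -> HmpConfig:
        # bisect_right returns the index of the interval that contains distance_m;
        # a distance equal to a threshold falls into the farther interval.
        return self._configs[bisect_right(self._thresholds, distance_m)]


# Hypothetical setup with three intervals (the paper uses up to seven).
model = DistanceAwareHmp([5.0, 10.0], [HmpConfig("good"), HmpConfig("average"), HmpConfig("bad")])
```

With this kind of binding, a node that moves during the simulation would simply switch matrices whenever it crosses one of the thresholds.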

Impact of a packet erasure channel on TCP
Although the results obtained with UDP traffic are appropriate to reflect the ''raw'' behavior of indoor channels, it is also interesting to assess the impact of these environments on different transport protocols, like TCP. As mentioned before, TCP is not able to determine the cause of a segment loss, which may be brought about either by the congestion of intermediate routers' buffers or by the hostile conditions of the wireless channel. The default TCP interpretation is always the same: when a segment gets lost, the TCP entity associates this event with a congestion situation, hence the congestion control mechanisms will act accordingly, by reducing the congestion window. Furthermore, this loss of information might lead to the reception of an out-of-order segment, an event that triggers the delivery of a duplicate ACK (Dup ACK) back to the sender. After the consecutive reception of three of them (Triple Duplicate ACK), the Fast Retransmit algorithm will immediately trigger the retransmission of the corresponding segment. In summary, the presence of long error bursts (as observed over the real channel) during a TCP transmission has a huge impact on the system performance. For these reasons, it is essential to provide channel models that accurately capture this behavior, in order to evaluate the performance of such protocols over wireless networks.
Figure 12 shows the relationship between the FER and the throughput for the three studied models, as well as the values observed during the empirical campaign and an upper bound, which is established by means of a memoryless channel. First of all, we can observe the poor performance exhibited by the Default model (Fig. 12a), with a clear memoryless behavior. On the other hand, the different HMP configurations present an acceptable level of variability (Fig. 12b). Finally, BEAR offers again the broadest range of possible outputs, mimicking quite well the memory effect shown by the real measurements.
As said before, TCP uses two retransmission triggers: the first one, the so-called Fast Retransmit algorithm, establishes that a segment must be immediately retransmitted after the reception of a triple duplicate ACK; on the other hand, if, after sending a segment, the transmitter does not receive an acknowledgement within a time interval (the Retransmission TimeOut, RTO), the segment is retransmitted. The latter has a stronger impact on the TCP performance, since it might lead to long inactivity periods (during which the channel is not used). Given that the RTO gradually increases (following a binary exponential backoff algorithm) after any timer-triggered retransmission, these idle times might reach rather long values. Hence, they could severely jeopardize the overall TCP performance. Figure 13 shows the maximum idle time cdf for the different channel models. As can be seen, the Default approach shows a complete lack of bursty behavior, and the idle times always stay below 2 s (even for the Bad configuration); on the other hand, HMP yields a rather predictable behavior for the Good and Average configurations (with values much lower than the ones observed over the real channel), and only the Bad instance leads to idle times greater than 5 s, but at the expense of causing inactivity periods greater than 60 s in around 4 % of the cases, which does not accurately reflect the real channel behavior. In contrast, BEAR appropriately mimics the variability assessed during the characterization carried out over the real channel.
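To give an idea of how quickly the binary exponential backoff leads to long idle times, the following sketch accumulates the RTO values used for consecutive timer-triggered retransmissions of the same segment. The initial RTO of 1 s and the 60 s cap are illustrative assumptions; the values actually used in a simulation depend on the RTT estimation and on the configuration of the TCP implementation.

```python
from typing import List


def rto_sequence(initial_rto_s: float = 1.0, timeouts: int = 6, max_rto_s: float = 60.0) -> List[float]:
    """RTO values applied to successive timeouts of the same segment.

    The RTO doubles after every timer-triggered retransmission and is
    capped at max_rto_s (both defaults are assumptions of this sketch).
    """
    values: List[float] = []
    rto = initial_rto_s
    for _ in range(timeouts):
        values.append(rto)
        rto = min(2.0 * rto, max_rto_s)
    return values


def idle_time_s(initial_rto_s: float = 1.0, timeouts: int = 6, max_rto_s: float = 60.0) -> float:
    """Total time spent waiting if `timeouts` consecutive timers expire."""
    return sum(rto_sequence(initial_rto_s, timeouts, max_rto_s))
```

Under these assumptions, six consecutive timeouts already keep the sender idle for about one minute, which is why long error bursts have such a strong effect on the maximum idle time.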
Finally, it is interesting to analyze the temporal evolution of some illustrative individual measurements. To that end, we represent the ''Time-TCP Sequence Number'' graph of some particular experiments. Figure 14 shows (per channel model) one good and one bad example. At first glance we can conclude that the Default model does not provide any variability at all (as was also discussed earlier), since the behavior remains almost identical across the 500 independent simulation trials. On the other hand, we can see that the other two models are actually able to lead to TCP connections with rather opposite behaviors, as was the case for the real channel.

Computational cost assessment
In this last phase we aim to characterize the computational cost (in terms of simulation time) of each of the wireless channel models studied in this work. For this purpose, we need to make various changes to the scenario, as described below.
1. The parameter that is modified for this particular analysis is the number of nodes deployed along a line topology. The first node is the source and the last one takes the receiver role. We increase the number of nodes from 2 to 32.
2. All the configurations share a common aspect: besides the particular operation of the proposed models, we have added another propagation loss entity, a range distance propagation loss model. It defines two different thresholds: the first one limits the radius within which every frame will be successfully received (when $P_{RX} > RX_{threshold}$); the second one establishes the distance that limits the carrier sense (CS) zone, in which the node will sense that another transmission is happening and will therefore defer its own transmission until it is finished ($RX_{threshold} > P_{RX} > CS_{threshold}$). Outside these zones, frames will always be corrupted ($P_{RX} < CS_{threshold}$). This ensures that a packet needs $N - 1$ hops to reach the destination and, at the same time, avoids the ''hidden-terminal'' effect, since the $i$th node will be able to overhear the transmissions carried out by up to its two-hop neighbors.
3. During the simulations, 5000 UDP datagrams are sent between the source and sink nodes, using a constant bit rate (CBR) application that delivers packets at a rate of 100 Kbps (non-saturating conditions).
4. The remaining configuration parameters keep the values chosen in the previous simulation campaigns.
5. Finally, we carried out a total of 25 independent runs for each of the scenarios.
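The three zones defined by the range-based propagation loss model can be sketched as follows. The function works directly on distances rather than on received power, and the node spacing and range values are hypothetical; they are only chosen so that one-hop neighbors are received, two-hop neighbors are sensed, and farther nodes are out of range, as required by the line topology described above.

```python
from enum import Enum


class Zone(Enum):
    RECEIVE = "receive"              # frame successfully received
    CARRIER_SENSE = "carrier_sense"  # transmission sensed, own transmission deferred
    OUT_OF_RANGE = "out_of_range"    # frame always corrupted


def classify(distance_m: float, rx_range_m: float, cs_range_m: float) -> Zone:
    """Map a transmitter-receiver distance to one of the three zones."""
    if rx_range_m > cs_range_m:
        raise ValueError("the reception range cannot exceed the carrier sense range")
    if distance_m <= rx_range_m:
        return Zone.RECEIVE
    if distance_m <= cs_range_m:
        return Zone.CARRIER_SENSE
    return Zone.OUT_OF_RANGE


def zone_between(i: int, j: int, spacing_m: float, rx_range_m: float, cs_range_m: float) -> Zone:
    """Zone seen by node j for a transmission of node i on an equally spaced line."""
    return classify(abs(i - j) * spacing_m, rx_range_m, cs_range_m)


# Hypothetical values: 10 m spacing, 12 m reception range, 25 m carrier sense range.
SPACING_M, RX_RANGE_M, CS_RANGE_M = 10.0, 12.0, 25.0
```

With these values every packet needs exactly one hop per pair of adjacent nodes, and any node can sense transmissions from up to its two-hop neighbors.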
Figure 15 represents the normalized simulation time of each of the runs (using the lowest value as the reference: two nodes, Default channel), as well as the 95 % confidence intervals (as can be seen, the variability of the results is almost negligible), as a function of the number of nodes deployed along the line topology. According to the obtained results, it is easily inferred that the Default model requires less time to decide whether a received frame is correct or not, since its complexity is rather low: after calculating the received signal power and the SNR from the propagation and interference models, respectively, the error model maps the SNR into a FER value, based on the BER curve associated with the appropriate modulation scheme (11 Mbps complementary code keying, CCK, in our work) [7,8]. The HMP model has a slightly higher cost, although the simulation time keeps an acceptable growth rate with the number of nodes. The complexity of this model lies in the matrices (transition and emission) that define the behavior of the wireless channel. Upon the reception of a frame by a node, the first step consists in looking up the current state of the Markov chain at the receiver node; after that, the error model takes the emission error value $b_i(0)$ belonging to that state and decides whether the frame is correct or not. It is worth mentioning that the change-of-state process is completely orthogonal to the reception of frames, since a transition is triggered by a negative exponential random variable, whose mean value is obtained as shown in Eq. (5). Finally, BEAR is the model that shows the longest simulation time, being penalized as the number of nodes gets higher. In a nutshell, besides the operation carried out by the Default model, BEAR needs another signal contribution (i.e. the SV one), obtained by means of the AR filter, which stores the last T samples of the SNR values and operates with them following Eq. (6). Upon the calculation of the overall SNR value, a logistic function is used to decide whether the received frame is correct or not.
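The per-frame work carried out by the time-based HMP model can be sketched as shown below. This is a minimal illustration under several assumptions: the mean sojourn time of each state (obtained from Eq. (5) in the paper) is passed in as a parameter, `error_prob[i]` plays the role of the emission error value $b_i(0)$, and the class and method names are hypothetical and do not correspond to the released ns-3 code.

```python
import random
from typing import List, Optional, Sequence


class TimeBasedHmpChannel:
    """Hidden Markov channel whose state changes are driven by time, not by frames."""

    def __init__(
        self,
        transition: Sequence[Sequence[float]],
        error_prob: Sequence[float],
        mean_sojourn_s: Sequence[float],
        rng: Optional[random.Random] = None,
        initial_state: int = 0,
    ) -> None:
        n = len(transition)
        if len(error_prob) != n or len(mean_sojourn_s) != n:
            raise ValueError("one error probability and one mean sojourn time per state")
        self._transition: List[List[float]] = [list(row) for row in transition]
        self._error_prob = list(error_prob)
        self._mean_sojourn_s = list(mean_sojourn_s)
        self._rng = rng if rng is not None else random.Random()
        self._state = initial_state
        self._next_change_s = self._draw_sojourn()

    def _draw_sojourn(self) -> float:
        # Negative exponential holding time with the mean of the current state.
        return self._rng.expovariate(1.0 / self._mean_sojourn_s[self._state])

    def _advance_to(self, now_s: float) -> None:
        # Apply every state change scheduled before the current reception time.
        while now_s >= self._next_change_s:
            weights = self._transition[self._state]
            self._state = self._rng.choices(range(len(weights)), weights=weights)[0]
            self._next_change_s += self._draw_sojourn()

    def frame_is_erroneous(self, now_s: float) -> bool:
        """Decide the fate of a frame received at time now_s (non-decreasing)."""
        self._advance_to(now_s)
        return self._rng.random() < self._error_prob[self._state]
```

Because the state evolves with the simulation clock, the sequence of states does not depend on how many frames are sent, which is the property that decouples the model from the traffic pattern. The per-frame cost is limited to a state lookup and one random draw, in line with the moderate simulation times reported for HMP.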

Discussion
At this point, we can summarize the main conclusions extracted throughout this simulation campaign. First, we have checked that, although the Default model captures quite appropriately the behavior of a wireless channel in terms of average FER and throughput, it is not able to reflect the memory factor observed over real scenarios, leading to an almost deterministic output, far from being realistic. Although its computational complexity is very low, its use is not recommended for indoor environments that require modeling of a wireless channel with memory. On the other hand, the BEAR model provides a remarkable performance, covering the whole range of results observed over the real channel with a single setup (i.e. the AR filter noise input P and the variance of the shadowing model); however, this model is complex, and its simulation time is long, especially when the number of nodes gets higher. The HMP model provides an interesting trade-off, since it offers a good degree of variability, showing an appropriate memory behavior, while keeping a reasonable computational growth rate (as a function of the number of nodes deployed over the scenario). That, together with the feature proposed in this work that provides the model with distance-dependent behavior, makes the HMP model a very appealing alternative.
Another aspect that should be looked at, especially considering the empirical nature of both models, is how reproducible they are and how they can be configured for different conditions (other IEEE 802.11 variants, different frame sizes, etc.). BEAR has two main parts: the modeling of the SNR and the dependency between the SNR and the probability of a frame being erroneous. It can be said that the SNR is independent of the frame size and of the type of IEEE 802.11 modulation scheme, as it is estimated during the PLCP [9]; hence, in order to include these into the model we should just find an appropriate match between the SNR and the FER (this could be done, for instance, by means of lookup tables). In this sense, Lertpratchya et al. have recently used BEAR to study the bursty behavior of wireless channels [10]. On the other hand, the HMP model could be more complex to update so as to consider different frame lengths, and would probably require additional measurement campaigns; the training of the corresponding Markov chains is rather systematic, though, and therefore the same methodology could be used to consider different frame lengths or modulation schemes. Nonetheless, the time-based configuration would still be of greater relevance so as to decouple its operation from the particularities of the traffic patterns.
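The lookup-table idea mentioned above could be sketched as a piecewise-linear mapping from SNR to FER, with one table per frame size or modulation scheme. The table entries below are purely hypothetical and are not taken from any measurement; the class name is also an assumption of this example.

```python
from bisect import bisect_left
from typing import List, Sequence, Tuple


class SnrToFerTable:
    """Piecewise-linear SNR (dB) to FER mapping built from a few anchor points."""

    def __init__(self, points: Sequence[Tuple[float, float]]) -> None:
        if len(points) < 2:
            raise ValueError("at least two (snr_db, fer) points are required")
        snrs = [snr for snr, _ in points]
        if any(b <= a for a, b in zip(snrs, snrs[1:])):
            raise ValueError("SNR values must be strictly increasing")
        self._snrs: List[float] = snrs
        self._fers: List[float] = [fer for _, fer in points]

    def fer(self, snr_db: float) -> float:
        # Clamp outside the tabulated range.
        if snr_db <= self._snrs[0]:
            return self._fers[0]
        if snr_db >= self._snrs[-1]:
            return self._fers[-1]
        hi = bisect_left(self._snrs, snr_db)  # snrs[hi - 1] < snr_db <= snrs[hi]
        lo = hi - 1
        weight = (snr_db - self._snrs[lo]) / (self._snrs[hi] - self._snrs[lo])
        return self._fers[lo] + weight * (self._fers[hi] - self._fers[lo])


# Hypothetical table for one frame size and modulation scheme.
table = SnrToFerTable([(0.0, 1.0), (10.0, 0.5), (20.0, 0.0)])
```

Supporting a new frame size or IEEE 802.11 variant would then amount to providing a new table, while the SNR modeling itself would remain unchanged.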

Related work
The first works within this research line focused on the empirical characterization of the performance of IP protocols over wireless local area networks (WLAN), using AT&T's WaveLAN wireless network adapters, which appeared even before the approval of the IEEE 802.11 standard in 1997. Within this group we can find the one carried out by Eckhart et al. [11], where, from a set of packet traces captured at a receiver entity, including information on both the signal and noise levels, as well as the presence of errors, they assessed the influence of different interference and attenuation sources in terms of both packet and bit error rates. At the same time, Nguyen et al. [12], following a similar methodology, aimed at finding a realistic model to emulate the behavior of a wireless channel. Starting from both the error rates and the burst lengths, they proposed an enhanced 2-state Markov model, in which they replaced the traditional geometric distributions used to model the time spent at each state with other distributions, which mimicked the empirically observed ones more accurately.
Other works focused on the characterization of an 11 Mbps IEEE 802.11b channel; it is worth highlighting the research done by Ikkurthy and Labrador [13], in which they studied the effect of errors on the transmission of coded video (using the widespread MPEG-4 video compressor). They carried out an experimental campaign in which they modified the packet size, and analyzed the erroneous and correct packet bursts, and their probability distributions, for a scenario where both nodes were separated by a distance of approximately 22 m. Comparing these results with the ones gathered by Nguyen et al. [12] at 2 Mbps, they came to the same conclusion: a simple geometric model does not precisely reflect the real behavior. They also concluded that for 1,500-byte packets, 90 % of the error bursts are shorter than four packets. Nonetheless, the authors did not specify the number of MAC retransmissions that were used during their measurements. In fact, they coined the term error bursts, whilst a more precise expression would have been packet losses.
Furthermore, there are a number of works which state that traditional channel models based on Markov chains (Gilbert-Elliott) are not able to reliably reflect the behavior observed over real indoor wireless environments. In [14] the authors show that this model is not able to reflect time periods with a high frame loss rate, which might have a strong effect on video transmission, both in terms of the quality perceived by the end user and for the design of appropriate error control procedures. In [15] the Gilbert-Elliott model was used to establish the most adequate parameters to reflect the perceived quality of a voice transmission using an adaptive forward error correction (FEC) scheme, setting out the need for a research effort to come up with more realistic channel models.
More recently, the authors in [16] rely on the results gathered from an experimental campaign carried out over an outdoor rural environment, in order to propose a model able to mimic the observed FER; they also identified the need to conduct a similar analysis over indoor scenarios. Finally, Cardoso et al. [5] question the appropriateness of a 2-state Markov chain to reflect the frame loss processes that are seen over real indoor IEEE 802.11 channels. They propose and evaluate a novel model based on an HMP. Although the use of HMP to model wireless channels was already discussed by Turin and van Nobelen [17] and Zhu and García-Frías [18], to the best of our knowledge, one of the first works proposing their use within Network Simulation platforms was that of Cardoso et al. Their results are compared with a batch of traces obtained from a set of experiments carried out at a constant bit rate, and without considering the IEEE 802.11 retransmission scheme. In addition, they established the traffic pattern at the source node, having a fixed interval (at the application layer) of 10 ms between consecutive packets, which is rather high if compared with the average IEEE 802.11 time gap between two consecutive transmissions (i.e. approximately 2 ms for an IEEE 802.11b saturated channel at 11 Mbps). Besides, they did not include any reference to the scenario topology (i.e. distance between the two nodes) nor to the received SNR. On the other hand, both the FER and the burst lengths they obtained are considerably lower than those we aim at modeling herein. They conclude that an 11-state HMP-based model with a birth-death structure was able to reflect (quite accurately) the first- and second-order statistics of the packet losses measured over a real testbed. However, as already discussed in Sect. 2, this model is not able to reflect a realistic behavior under traffic conditions different from those that were used to configure the HMP.
Besides, with the main goal of overcoming some of the main wireless modeling drawbacks, we proposed a new channel model: BEAR [3]. As was already mentioned throughout this document, it is based on the modeling of the SNR, resembling a set of traces obtained during an extensive measurement campaign carried out over a real indoor scenario. Its most distinguishing feature is that it aims to reflect the memory effect shown over a real channel, using an auto-regressive filter. We compared its performance with other alternatives, widely used in the literature (all of them showing a memoryless behavior) as well as the traditional Gilbert-Elliott model. BEAR outperformed the rest of the channels studied by the authors, but none of them was characterized by offering a memory behavior.
With respect to ns-3 [1], the simulator framework that is likely to be prominent in the near/mid term, the mainstream available models for wireless channels [7,8] are based on the usage of BER curves, as a function of the RSS. Although they perform quite well in terms of FER and throughput, they are not able to appropriately reflect the bursty nature of real indoor wireless channels. There have been a number of proposals to overcome the limitations of these legacy wireless channel models. Papanastasiou et al. [19] challenge the suitability of this particular simulator and other alternatives (e.g. QualNet), arguing that, although the upper layers are accurately implemented, little attention has usually been paid to the physical behavior, which relies on many abstractions and simplifications. Hence, they propose a clean-slate alternative to the legacy models supported by ns-3. The authors present a fully-fledged bit-level physical layer emulator tuned for the orthogonal frequency division multiplexing (OFDM) based IEEE 802.11 transmissions. Although this approach could get closer to the real behavior of wireless channels, its intrinsic complexity penalizes the time required to perform the simulations, which might be prohibitive over scenarios with a large number of deployed nodes and traffic flows.
Last, but not least, Al-Bado et al. [20] used an extensive empirical campaign over a real indoor scenario to propose a new ns-3 wireless channel model, tailored from the frame detection rate (FDR), as well as the capture and interference patterns observed over the real measurements, for different physical rates (i.e. IEEE 802.11g at 6, 24 and 54 Mbps). Although they share the same empirical approach that we exploited to configure the two channel models, they focus on other aspects, rather than the channel bursty behavior. In particular, they pay attention to both the interference and the capture effects, and their model is tightly related to their testbed. It should not be too complicated to include those effects (especially the interference) within BEAR, and this might be an interesting point to tackle in our future research.

Conclusions
In this work we presented two different wireless channel models, tailored from the results obtained over a real indoor testbed, following two complementary approaches: the first one (BEAR) estimates the RSS by means of an auto-regressive filter, whilst the second one (HMP) ''discretizes'' the error response of the wireless channel in a finite number of states, building a hidden Markov chain. The cornerstone of both models is that they aim at reflecting the bursty nature that characterizes real indoor wireless channels, whose memory behavior is disregarded by the vast majority of simulators, which usually provide a rather predictable behavior.
We have carried out an extensive simulation campaign to characterize the behavior of these models using a naive transport-layer protocol, UDP, to study the raw performance offered by the lower layers. Under these assumptions, we have observed that, although the Default model provides acceptable results in terms of the average FER and throughput, it exhibits an almost deterministic behavior. This demonstrates that this sort of model fails to capture the memory effect, and thus every frame reception can be considered as an independent event. On the other hand, HMP and BEAR, besides being able to mimic the average performance (FER and throughput), yield a much broader range of outputs. Regarding their bursty behavior, we have seen that both of them adequately reflect the results observed during the real measurements, capturing as well the expected behavior in terms of EFBs.
One of the most obvious limitations of the legacy HMP models is their lack of dependency on the received signal quality, since the corresponding matrices (A and B) are chosen offline (before the simulation starts). To overcome this limitation, we have added the possibility to dynamically change the HMP coefficients according to the distance between source and destination. We have taken the BEAR performance to tailor the thresholds between which the distance/HMP coefficient bindings are done. Regarding the obtained results we can assert that, although we are limited to a small finite number of configurations, the broad range of behaviors brings about the possibility of providing a dynamic-range model.
After the analysis of the raw performance of the lower layers (and of the channel itself), we have assessed the impact that these bursty channel models have on connection-oriented protocols, in particular TCP. Its performance is severely damaged, since its intrinsic congestion control mechanisms are extremely sensitive to consecutive segment losses, thus jeopardizing the overall throughput. In this study, we have verified the almost null variability and bursty effect provided by the Default model, which makes it completely unsuitable to study the performance of TCP-based applications over indoor wireless channels. On the contrary, with both BEAR and HMP, the simulation results showed a broad range of outputs, as well as an appropriate memory behavior. Besides, these models are also able to reflect the harmful effect brought about by long EFBs, leading to remarkable idle times at the transmitter.
Finally, we have also analyzed the computational complexity of the three different channels. The Default model shows the lowest simulation time, but this does not compensate for its lack of accuracy. On the other hand, we found that BEAR requires the longest time, leaving the HMP model as an intermediate solution which, together with its rather realistic performance, shows a reasonable complexity, making it attractive for scenarios with a large number of nodes.
Regarding the future work, the most straightforward aspect to be mentioned is the fact that the analyzed channel models can be exploited to evaluate various techniques, algorithms and protocols, including cross-layer techniques.
In particular, we plan to use them to study the performance of Network Coding techniques, focusing on the impact of error bursts on the performance gain that those techniques might bring about. Furthermore, there are still a number of open issues that could be tackled in the future in order to enhance and extend the functionalities of the proposed channel models, as described below.
• First, we would like to adapt our models to more recent IEEE 802.11 physical specifications (i.e. g/n/ac). In order to be able to appropriately tune their different configuration parameters, such as HMP's matrices and BEAR's AR filter coefficients and logistic functions, we first need to carry out a measurement campaign over a real indoor scenario for each of the IEEE 802.11 recommendations to be mimicked.
• Another interesting aspect would be to evaluate the performance of these models under different conditions, such as number of nodes or traffic patterns.
• It is worth highlighting that our models disregard the interference contribution produced by contention (and collisions) with other IEEE 802.11 stations, coexisting 2.4 GHz radio technologies over the coverage area, etc. Actually, they rely on the legacy interference model helpers provided by the simulator, whose operation is currently under development in [21]. The interaction between these physical-level solutions shall be addressed in order to create a holistic solution in the future.
Last, but not least, all the information regarding the two proposed models (both HMP and BEAR) has been made available to the scientific community [22]. We strongly encourage interested readers to download the code, assess the suitability of the models, and use them for their own research, as their feedback would help us to improve them.

1. Number of states in the model, $N$.
2. Number of possible output values, $M$; in this work, there will be only two: correct or erroneous frame.
3. Transition matrix (A), with dimension $N \times N$, containing all the state change probabilities, $a_{i,j}$.
4. Emission matrix (B), with dimension $N \times M$. Each element represents the probability of having output $k$ at state $i$, $b_i(k)$.
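As a small worked example of how these four parameters determine the long-term behavior of the model, the sketch below computes the stationary distribution of the hidden chain by power iteration and, from it, the average FER. It assumes that the chain is ergodic (so that the iteration converges) and that output index 0 denotes an erroneous frame, which is one possible reading of $b_i(0)$; the matrices used are hypothetical.

```python
from typing import List, Sequence


def stationary_distribution(
    A: Sequence[Sequence[float]], iterations: int = 10_000, tol: float = 1e-12
) -> List[float]:
    """Stationary state probabilities pi (pi = pi * A) obtained by power iteration."""
    n = len(A)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        nxt = [sum(pi[i] * A[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(pi, nxt)) < tol:
            return nxt
        pi = nxt
    return pi


def average_fer(
    A: Sequence[Sequence[float]], B: Sequence[Sequence[float]], error_symbol: int = 0
) -> float:
    """Long-term FER: sum over states of pi_i * b_i(error_symbol)."""
    pi = stationary_distribution(A)
    return sum(p * row[error_symbol] for p, row in zip(pi, B))


# Hypothetical model with N = 2 states and M = 2 outputs (0 = erroneous, 1 = correct).
A_EXAMPLE = [[0.9, 0.1], [0.5, 0.5]]
B_EXAMPLE = [[0.0, 1.0], [0.8, 0.2]]
```

For this example the chain spends 5/6 of the time in the error-free state, so the average FER is (1/6) x 0.8, roughly 0.13, even though errors are concentrated in the second state and therefore arrive in bursts.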

Fig. 2 Accuracy loss of the frame-based mode upon non-trained traffic conditions. a Frame-based mode, b Time-based mode

Fig. 7 Default model SNR pdf of an arbitrary transmission

Fig. 11 HMP behavior as a function of the distance. a FER cdf, b PER cdf, c Throughput cdf

Fig. 15 Computational cost comparison between the studied channel models as a function of the number of nodes over a line topology

Table 1 UDP performance over a real indoor wireless channel

Table 2 Statistics without saturating the link (Bad channel)