Sample entropy

Sample entropy (SampEn) is a modification of approximate entropy (ApEn), used for assessing the complexity of physiological time-series signals and diagnosing diseased states. [1] SampEn has two advantages over ApEn: data length independence and a relatively trouble-free implementation. There is also a small computational difference: in ApEn, the comparison between the template vector (see below) and the rest of the vectors includes a comparison with itself. This guarantees that probabilities are never zero, so it is always possible to take a logarithm of probabilities. Because these self-matches lower ApEn values, the signals are interpreted as more regular than they actually are. Self-matches are not included in SampEn. However, since SampEn makes direct use of the correlation integrals, it is not a real measure of information but an approximation. The foundations of SampEn and its differences with ApEn, as well as a step-by-step tutorial for its application, are available in [2].

There is a multiscale version of SampEn as well, suggested by Costa and others. [3] SampEn can be used in biomedical and biomechanical research, for example to evaluate postural control. [4] [5]

Definition

Like approximate entropy (ApEn), sample entropy (SampEn) is a measure of complexity. [1] But it does not include self-similar patterns as ApEn does. For a given embedding dimension $m$, tolerance $r$ and number of data points $N$, SampEn is the negative natural logarithm of the probability that if two sets of simultaneous data points of length $m$ have distance $< r$, then two sets of simultaneous data points of length $m + 1$ also have distance $< r$. We represent it by $\mathrm{SampEn}(m, r, N)$ (or by $\mathrm{SampEn}(m, r, \tau)$ when the sampling time $\tau$ is included).

Now assume we have a time-series data set $\{x_1, x_2, \dots, x_N\}$ of length $N$ with a constant time interval $\tau$. We define a template vector of length $m$ as $X_m(i) = \{x_i, x_{i+1}, \dots, x_{i+m-1}\}$, and the distance function $d[X_m(i), X_m(j)]$ ($i \neq j$) to be the Chebyshev distance (but it could be any distance function, including Euclidean distance). We define the sample entropy to be

$\mathrm{SampEn} = -\ln \frac{A}{B}$

Where

$A$ = number of template vector pairs of length $m + 1$ having $d[X_{m+1}(i), X_{m+1}(j)] < r$

$B$ = number of template vector pairs of length $m$ having $d[X_m(i), X_m(j)] < r$

It is clear from the definition that $A$ will always have a value smaller than or equal to $B$. Therefore, $\mathrm{SampEn}$ will always be either zero or a positive value. A smaller value of $\mathrm{SampEn}$ also indicates more self-similarity in the data set, or less noise.
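
For illustration (with hypothetical counts, not drawn from the cited studies): if a record yields $B = 40$ template pairs of length $m$ within the tolerance $r$, and $A = 10$ of those pairs still match at length $m + 1$, then $\mathrm{SampEn} = -\ln(10/40) = \ln 4 \approx 1.39$. A more regular record in which, say, $A = 35$ of the pairs persist would give $-\ln(35/40) \approx 0.13$.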

Generally we take the value of $m$ to be $2$ and the value of $r$ to be $0.2 \times \mathrm{std}$, where std stands for the standard deviation, which should be taken over a very large dataset. For instance, an $r$ value of 6 ms is appropriate for sample entropy calculations of heart rate intervals, since this corresponds to $0.2 \times \mathrm{std}$ for a very large population.

Multiscale SampEn

The definition mentioned above is a special case of multiscale SampEn with $\delta = 1$, where $\delta$ is called the skipping parameter. In multiscale SampEn, template vectors are defined with a certain interval between their elements, specified by the value of $\delta$. The modified template vector is defined as $X_{m,\delta}(i) = \{x_i, x_{i+\delta}, x_{i+2\delta}, \dots, x_{i+(m-1)\delta}\}$, and SampEn can be written as $\mathrm{SampEn}(m, r, \delta)$. We then calculate $A_\delta$ and $B_\delta$ as before.
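
As a minimal sketch of this idea (the function name construct_templates_delta and its delta parameter are illustrative assumptions, not part of any standard library), the template construction with a skipping parameter could look as follows in Python:

def construct_templates_delta(timeseries_data: list, m: int = 2, delta: int = 1):
    # Build template vectors {x_i, x_{i+delta}, ..., x_{i+(m-1)*delta}};
    # delta = 1 reproduces the ordinary SampEn templates used below.
    num_windows = len(timeseries_data) - (m - 1) * delta
    return [timeseries_data[i:i + (m - 1) * delta + 1:delta] for i in range(num_windows)]

The pair counting and the final $-\ln(A_\delta / B_\delta)$ step are unchanged; only the way templates are drawn from the series differs.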

Implementation

Sample entropy can be implemented easily in many different programming languages. Below is an example written in Python.

from itertools import combinations
from math import log


def construct_templates(timeseries_data: list, m: int = 2):
    # Build all template vectors of length m from the time series.
    num_windows = len(timeseries_data) - m + 1
    return [timeseries_data[x:x + m] for x in range(0, num_windows)]


def get_matches(templates: list, r: float):
    # Count the template pairs whose Chebyshev distance is below the tolerance r.
    return len(list(filter(lambda x: is_match(x[0], x[1], r), combinations(templates, 2))))


def is_match(template_1: list, template_2: list, r: float):
    # Two templates match if every pair of corresponding elements differs by less than r.
    return all([abs(x - y) < r for (x, y) in zip(template_1, template_2)])


def sample_entropy(timeseries_data: list, window_size: int, r: float):
    # B counts matching pairs of length m, A counts matching pairs of length m + 1;
    # SampEn = -ln(A / B).
    B = get_matches(construct_templates(timeseries_data, window_size), r)
    A = get_matches(construct_templates(timeseries_data, window_size + 1), r)
    return -log(A / B)
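
As a usage sketch (the short integer series and the parameter choices below are illustrative assumptions, not data from the cited studies), the function above could be called like this:

from statistics import stdev

# A perfectly periodic toy signal; real inputs would be physiological time series.
data = [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3]

m = 2                   # embedding dimension, commonly taken to be 2
r = 0.2 * stdev(data)   # tolerance, commonly 0.2 times the standard deviation
print(sample_entropy(data, m, r))  # prints roughly 0.22 for this highly regular series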


References

  1. Richman, JS; Moorman, JR (2000). "Physiological time-series analysis using approximate entropy and sample entropy". American Journal of Physiology. Heart and Circulatory Physiology. 278 (6): H2039–49. doi:10.1152/ajpheart.2000.278.6.H2039. PMID 10843903.
  2. Delgado-Bonal, Alfonso; Marshak, Alexander (June 2019). "Approximate Entropy and Sample Entropy: A Comprehensive Tutorial". Entropy. 21 (6): 541. Bibcode:2019Entrp..21..541D. doi:10.3390/e21060541. PMC 7515030. PMID 33267255.
  3. Costa, Madalena; Goldberger, Ary; Peng, C.-K. (2005). "Multiscale entropy analysis of biological signals". Physical Review E. 71 (2): 021906. Bibcode:2005PhRvE..71b1906C. doi:10.1103/PhysRevE.71.021906. PMID 15783351.
  4. Błażkiewicz, Michalina; Kędziorek, Justyna; Hadamus, Anna (March 2021). "The Impact of Visual Input and Support Area Manipulation on Postural Control in Subjects after Osteoporotic Vertebral Fracture". Entropy. 23 (3): 375. Bibcode:2021Entrp..23..375B. doi:10.3390/e23030375. PMC 8004071. PMID 33804770.
  5. Hadamus, Anna; Białoszewski, Dariusz; Błażkiewicz, Michalina; Kowalska, Aleksandra J.; Urbaniak, Edyta; Wydra, Kamil T.; Wiaderna, Karolina; Boratyński, Rafał; Kobza, Agnieszka; Marczyński, Wojciech (February 2021). "Assessment of the Effectiveness of Rehabilitation after Total Knee Replacement Surgery Using Sample Entropy and Classical Measures of Body Balance". Entropy. 23 (2): 164. Bibcode:2021Entrp..23..164H. doi:10.3390/e23020164. PMC 7911395. PMID 33573057.