Accepted date: June 20, 2016
The selection of optimal features of EEG signals is important for discriminating mental tasks in brain-computer interface (BCI) research. This research presents a new technique, Hilbert-Huang entropy (HHE), for extracting features from EEG signals recorded from a subject performing left and right hand motor imagery (MI). In our method, the raw signal is processed with an elliptic band-pass filter and the Hilbert-Huang transform (HHT). The marginal spectra of the mu and beta bands are the features of interest, calculated from the Hilbert-Huang spectrum of the selected intrinsic mode functions (IMFs) of the filtered EEG signal. Shannon entropy (SE) is then applied within the framework of the HHT algorithm. The feature vector formed from the SE values is used to train a support vector machine (SVM) classifier. The performance of the new method is compared with that of features from the HHT algorithm alone, and the comparison indicates that the HHE algorithm is promising for BCI classification.
Brain-computer interface (BCI), Motor imagery (MI), Hilbert-Huang entropy (HHE), Intrinsic mode functions (IMF), Shannon entropy (SE), Support vector machine (SVM).
Brain-computer interface (BCI) is an emerging multidisciplinary technology that provides a new channel of communication between the brain and the environment, enabling a subject with neuromuscular disabilities to use the electroencephalogram (EEG) to communicate and control devices without any peripheral muscle activity [1-2]. In BCIs based on motor imagery, the EEG is a spontaneous signal measured through electrodes placed on the surface of the scalp. A growing body of evidence indicates that the kinaesthetic imagination of body movement produces an underlying neurophysiological phenomenon in the sensorimotor cortex, termed event-related synchronization (ERS)/desynchronization (ERD). The corresponding EEG signals, especially in the μ (8-13 Hz) and β (14-25 Hz) frequency bands, have been used as effective signal features for recognizing mental tasks in motor imagery-based BCI.
A BCI can be divided into several stages: acquisition of EEG signals, signal pre-processing, feature extraction, classification and device control. The key problem in motor imagery-based BCI is how to extract features from the EEG signals that accurately represent the different mental tasks. In recent years, many EEG feature extraction algorithms have been used in BCI applications: power spectral density (PSD), short-time Fourier transform (STFT), autoregressive (AR) model and wavelet transform (WT). PSD is well known and yields the power spectrum of a signal, but it is insensitive to nonlinear structure contained in the time series.
STFT is a powerful tool for extracting features from EEG data in the time and frequency domains. However, it forces a trade-off between time resolution and frequency resolution, and its frequency estimates are sensitive to noise. The AR model has advantages in spectral estimation and signal modelling, but the model order has to be chosen via order selection criteria, and different model orders can give inconsistent results. The WT, extended from the STFT, captures transient features of signals and localizes them in both the time and frequency domains, but it is limited by the fundamental uncertainty principle, and it is difficult to select an appropriate wavelet and decomposition level.
More recently, the Hilbert-Huang transform (HHT), an emerging technique for analysing nonlinear and non-stationary signals, has been used to analyse biomedical signals, for example in ECG de-noising, seizure detection from EEG signals, extraction of schizophrenic EEG synchrony, and in particular the analysis of steady-state evoked potentials (SSEP) and evoked potentials (P300). This research proposes an application of the HHT algorithm to the discrimination of mental tasks. As a time-frequency analysis algorithm, HHT can produce physically meaningful representations of a signal in both the time and frequency domains. The decomposition at the core of this algorithm is data-dependent and posteriori-defined, and the inner scales of the decomposed signal are well suited to EEG signal processing. HHT is composed of empirical mode decomposition (EMD) and Hilbert spectral analysis (HSA); EMD decomposes the original signal into a set of symmetric, amplitude- and frequency-modulated intrinsic mode functions (IMFs). Shannon entropy (SE) is a measure derived from the original definition suggested by Shannon, who defined entropy as the average amount of information of a probability distribution. It has been widely used for analysing non-stationary signals. Moreover, it is a useful tool for quantifying the global regularity of EEG signals [15-17]. Martis et al. used selected spectral entropy and spectral energy of IMFs to diagnose seizures automatically. Hemalatha predicted and detected seizures from EEG signals with continuous wavelet entropy. Ni and Wang employed Shannon entropy to detect features of deepening anaesthesia. In this work, SE combined with the HHT algorithm is used to detect the regularity features contained in the EEG signal during mental tasks, which are then used as the feature vector for the classifier.
The proposed method not only analyses EEG signals with the data-dependent and posteriori-defined HHT, which, unlike the WT, does not require a predetermined wavelet, but also uses SE to reveal the irregular changes of EEG features during mental tasks.
In general, the purpose of this paper is to examine the ability of the SE method combined with the HHT technique to extract features from EEG signals for discriminating mental tasks. In this work, HHT is applied to obtain the marginal spectrum by analysing the instantaneous time-frequency information of the mental task data. The frequency band (8-30 Hz) of the marginal spectrum, which covers the μ and β rhythms, is used to compute the SE feature. Moreover, in order to obtain SE features for different periods and time intervals, a sliding window technique is applied to segment the original EEG signal. The resulting SE feature vector is passed to an SVM classifier with a radial basis function kernel (RBF-SVM) for discrimination of the mental tasks.
This work proposes a new technique based on time-frequency information which uses both HHT and SVM to identify mental tasks. HHT is used to acquire the time-frequency features of the EEG signals. SE is then applied to these time-frequency features, restricted to the μ and β rhythms of the EEG signal and segmented with the sliding window technique; the resulting SE features are named Hilbert-Huang entropy (HHE). Finally, an RBF-SVM classifier is applied to the HHE features. The flow diagram of this work is given in Figure 1.
The Hilbert-Huang transform is a data-dependent and posteriori-defined signal analysis algorithm. Compared with other time-frequency analysis methods, HHT can adaptively track the evolution of the time-frequency content of the original signal without employing a time or frequency resolution window, and it can also provide detailed information at arbitrary time-frequency scales.
Empirical mode decomposition
As a data-dependent algorithm, EMD can decompose a nonlinear and non-stationary signal into a set of band-limited IMFs. Each IMF must satisfy two conditions:
1. Over the whole data set, the number of extrema and the number of zero crossings must either be equal or differ by no more than one;
2. At any point, the mean value of the envelopes defined by the local maxima and the local minima is zero.
The EMD algorithm decomposes a signal x (t) through a sifting process, which can be described by the following steps:
1. Detect the local maxima and minima of the signal x (t);
2. Obtain the upper envelope Xup (t) and the lower envelope Xlow (t), defined by the local maxima and minima respectively;
3. Calculate the mean of the two envelopes, designated m (t);
4. Subtract the mean from the signal to obtain h (t) = x (t) - m (t);
5. Check whether h (t) satisfies the two conditions for an IMF. If the conditions are satisfied, set ci (t) = h (t) as the i-th IMF; otherwise h (t) is treated as the new data, and steps 1-4 are repeated;
6. Subtract the extracted IMF from the data to form the residual r (t), and iterate steps 1-5 on the residual to obtain all IMFs of the signal until r (t) is a monotonic function.
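Finally, the signal x (t) can be expressed as the sum of the IMFs and the final residual:
$$x(t) = \sum_{i=1}^{n} c_i(t) + r_n(t)$$
where n is the number of IMFs, ci (t) is the i-th IMF, and rn (t) is the final residual, which can be either the mean trend or a constant.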
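The sifting steps can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the cubic-spline envelopes pinned to the end samples, the stopping tolerance `tol`, the limits `max_sift` and `max_imfs`, and the two-tone test signal are all illustrative assumptions.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema


def envelope_mean(t, x):
    """Mean m(t) of the spline envelopes through the local maxima and minima."""
    maxima = argrelextrema(x, np.greater)[0]
    minima = argrelextrema(x, np.less)[0]
    if len(maxima) < 2 or len(minima) < 2:
        return None  # too few extrema: treat x as monotonic
    # Assumption: pin both envelopes to the end samples to keep splines bounded.
    upper_idx = np.concatenate(([0], maxima, [len(x) - 1]))
    lower_idx = np.concatenate(([0], minima, [len(x) - 1]))
    upper = CubicSpline(t[upper_idx], x[upper_idx])(t)
    lower = CubicSpline(t[lower_idx], x[lower_idx])(t)
    return 0.5 * (upper + lower)


def sift_imf(t, x, max_sift=50, tol=0.05):
    """Repeat h = h - m until the envelope mean carries little energy."""
    h = x.copy()
    for _ in range(max_sift):
        m = envelope_mean(t, h)
        if m is None:
            break
        energy = np.sum(h ** 2)
        h = h - m
        # Simplified stopping rule (assumption), standing in for the IMF checks.
        if np.sum(m ** 2) <= tol * energy:
            break
    return h


def emd(t, x, max_imfs=8):
    """Extract IMFs one by one; stop when the residual has too few extrema."""
    imfs = []
    residual = np.asarray(x, dtype=float).copy()
    for _ in range(max_imfs):
        if envelope_mean(t, residual) is None:
            break
        imf = sift_imf(t, residual)
        imfs.append(imf)
        residual = residual - imf
    return np.array(imfs), residual


# Hypothetical test signal: a 20 Hz tone plus a slower 3 Hz tone at 128 Hz.
fs = 128
t = np.arange(0, 4, 1 / fs)
x = np.sin(2 * np.pi * 20 * t) + 0.5 * np.sin(2 * np.pi * 3 * t)
imfs, residual = emd(t, x)
```

By construction, the IMFs and the residual returned by `emd` add back up to the input signal, which is a convenient sanity check on any EMD routine.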
Hilbert spectral analysis
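After all IMFs have been calculated by the EMD method, the Hilbert transform is applied to every IMF to obtain its instantaneous frequency ωi (t) and amplitude ai (t). For any signal x (t), its Hilbert transform X̂ (t) and analytic signal z (t) are
$$\hat{X}(t) = \frac{1}{\pi} P \int_{-\infty}^{\infty} \frac{x(\tau)}{t - \tau} \, d\tau$$
$$z(t) = x(t) + i \hat{X}(t) = a(t) e^{i \varphi(t)}$$
where P is the Cauchy principal value, z (t) is the analytic signal, and φ (t) is the instantaneous phase.
The instantaneous amplitude a (t), the instantaneous phase φ (t) and the instantaneous frequency ω (t) are then computed as
$$a(t) = \sqrt{x^2(t) + \hat{X}^2(t)}, \qquad \varphi(t) = \arctan \frac{\hat{X}(t)}{x(t)}, \qquad \omega(t) = \frac{d\varphi(t)}{dt}$$
Therefore, the Hilbert-Huang spectrum H (ω, t) and the local marginal spectrum h (ω) are represented as
$$H(\omega, t) = \mathrm{Re} \sum_{i=1}^{n} a_i(t) \exp\left( i \int \omega_i(t) \, dt \right)$$
$$h(\omega) = \int_{0}^{T} H(\omega, t) \, dt$$
where the integral is taken over the duration T of the analysed segment.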
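As a rough numerical illustration of Hilbert spectral analysis, the sketch below uses `scipy.signal.hilbert` to form the analytic signal of each IMF and accumulates the instantaneous amplitude into frequency bins to approximate a marginal spectrum. The bin layout (`f_max`, `n_bins`), the helper names and the 10 Hz test tone are assumptions made for the example, not choices taken from this work.

```python
import numpy as np
from scipy.signal import hilbert


def instantaneous_attributes(imf, fs):
    """Instantaneous amplitude and frequency (Hz) from the analytic signal."""
    z = hilbert(imf)                        # z(t) = x(t) + i * x_hat(t)
    amplitude = np.abs(z)
    phase = np.unwrap(np.angle(z))
    freq_hz = np.diff(phase) * fs / (2 * np.pi)   # finite-difference d(phi)/dt
    return amplitude[:-1], freq_hz          # drop one sample to align lengths


def marginal_spectrum(imfs, fs, f_max=40.0, n_bins=80):
    """Approximate h(w): amplitude accumulated over time in each frequency bin."""
    edges = np.linspace(0.0, f_max, n_bins + 1)
    h = np.zeros(n_bins)
    for imf in imfs:
        a, f = instantaneous_attributes(imf, fs)
        hist, _ = np.histogram(f, bins=edges, weights=a)
        h += hist / fs                      # multiply by dt = 1 / fs
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, h


# Hypothetical check: a pure 10 Hz tone should concentrate h(w) near 10 Hz.
fs = 128
t = np.arange(0, 4, 1 / fs)
centers, h = marginal_spectrum([np.sin(2 * np.pi * 10 * t)], fs)
```

Binning by instantaneous frequency is only one way to discretise the spectrum; the bin width limits the frequency resolution of the resulting h (ω).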
Features extracted based on Hilbert-Huang Entropy
In the previous subsection, the HHT was described as a way to obtain the Hilbert-Huang spectrum H (ω, t), which provides the time-frequency distribution of the processed signal. Moreover, the marginal spectrum h (ω) measures the total amplitude (or energy) contribution of each frequency of interest. Therefore, the local marginal spectrum h (ω) over the frequency band (8-30 Hz) covering the μ and β rhythms is selected, and its mean μm and variance σ2 are calculated as features.
However, the traditional features of mean μm and variance σ2 are not sufficient to describe the information in the EEG signal during motor imagery. Therefore, this work employs a new feature, named Hilbert-Huang entropy (HHE), which is obtained by applying a modified form of the Shannon entropy definition to the local marginal spectrum.
In signal processing, Shannon entropy reveals irregular changes in signal properties such as the spectrum and amplitude distribution. Therefore, HHE reflects the level of irregularity of the local marginal spectrum distribution at a specific time, and it is selected as the EEG feature for identifying different mental tasks. In order to optimize the HHE, a sliding time window technique is employed to segment the original EEG signal.
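A common way to apply Shannon entropy to a spectrum, and one plausible reading of HHE, is to normalise the band-limited local marginal spectrum into a probability distribution and take its entropy. The symbols p_k, ω_k and N (the number of frequency bins in the 8-30 Hz band) and the choice of the natural logarithm are illustrative assumptions rather than notation confirmed by this work:

```latex
p_k = \frac{h(\omega_k)}{\sum_{j=1}^{N} h(\omega_j)}, \qquad
\mathrm{HHE} = -\sum_{k=1}^{N} p_k \ln p_k
```

Under this form, a flat spectrum gives the maximum value ln N, while a spectrum concentrated in a single bin gives zero.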
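The three spectral features discussed here can be sketched in a few lines of Python. The function below assumes the marginal spectrum is available as an array `h` sampled at frequencies `freqs`, and it assumes the entropy is computed on the spectrum normalised to sum to one; both the function name and that normalisation are hypothetical choices for illustration.

```python
import numpy as np


def band_features(freqs, h, band=(8.0, 30.0)):
    """Mean, variance and Shannon entropy of a marginal spectrum within a band."""
    mask = (freqs >= band[0]) & (freqs <= band[1])
    h_band = h[mask]
    mean = h_band.mean()
    variance = h_band.var()
    # Assumption: normalise the band-limited spectrum to a probability distribution.
    p = h_band / h_band.sum()
    p = p[p > 0]                      # skip empty bins, since 0 * log(0) -> 0
    entropy = -np.sum(p * np.log(p))
    return mean, variance, entropy


# Hypothetical spectra on a 0.5 Hz grid: one flat, one with a single peak.
freqs = np.arange(0.0, 40.0, 0.5)
flat = np.ones_like(freqs)
peaked = np.zeros_like(freqs)
peaked[freqs == 10.0] = 1.0
flat_features = band_features(freqs, flat)
peaked_features = band_features(freqs, peaked)
```

The flat spectrum yields the largest possible entropy for the band and the single-peak spectrum yields zero, which is the sense in which the entropy measures irregularity of the spectral distribution.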
Classification based on support vector machine
In this work, an SVM classifier is selected to classify the extracted feature vectors of EEG signals. As a machine learning method grounded in statistical learning theory, SVM performs well on model identification problems that involve small samples, nonlinearity and high dimensionality. The key idea of the SVM algorithm is to find a hyper-plane that separates the training data xi with labels yi while maximizing the margin. The constraint is as follows:
$$y_i (w \cdot x_i + b) \geq 1, \quad i = 1, \ldots, N$$
where w is the weighting vector, b is the bias term, and N is the number of training samples.
To obtain the optimal hyper-plane when the data are not perfectly separable, the SVM algorithm tolerates misclassification:
$$\min_{w, b, \xi} \; \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{N} \xi_i \quad \text{subject to} \quad y_i (w \cdot x_i + b) \geq 1 - \xi_i, \quad \xi_i \geq 0$$
where ξi are the slack variables, and C is a positive real weighting constant.
In addition, the choice of kernel function has an important influence on SVM classification. In this work, the radial basis kernel, also named the radial basis function (RBF), is selected, where the parameter g controls the width of the RBF kernel.
Finally, the features extracted by the HHE algorithm are sent to the SVM classifier to obtain the identification result.
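To make the role of g concrete, the sketch below evaluates an RBF kernel in NumPy. It assumes the common form K(x, x') = exp(-g ||x - x'||^2), the convention used by LIBSVM and scikit-learn; whether this work uses exactly that parameterisation is an assumption, and the three sample points are made up.

```python
import numpy as np


def rbf_kernel(X, Y, g):
    """Gram matrix K[i, j] = exp(-g * ||X[i] - Y[j]||^2) (assumed convention)."""
    sq_dist = (
        np.sum(X ** 2, axis=1)[:, None]
        + np.sum(Y ** 2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-g * np.maximum(sq_dist, 0.0))


# Hypothetical feature vectors.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
K = rbf_kernel(X, X, g=0.5)
```

A larger g makes the kernel fall off faster with distance, so each training sample influences a smaller neighbourhood of the feature space.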
The data used in this work is Dataset III of the BCI 2003 competition, provided by University of Technology Graz in Austria. The dataset consists of a training set x_train and a testing set x_test. Each contains 140 three-channel (C3, Cz and C4) trials; every trial lasts 9 s, and the sampling frequency is 128 Hz. All EEG signals were recorded from one healthy volunteer (female, 25 y) imagining left or right hand movements. Moreover, y_train contains the class labels '1' and '2' for left and right hand imagery movements in the training dataset. A more detailed description of the dataset is given in the manuscript. The EEG signals of imagining left and right hand movements are shown in Figure 2.
This work employs the EMD method to decompose each signal of the dataset into a set of IMFs. Figure 3 displays the decomposition into several IMFs of the channel C3 and C4 EEG signals for right hand motor imagery, which shows that the IMFs are ordered by frequency. In order to select the appropriate IMFs, all IMFs were transformed into the frequency domain by power spectrum estimation, and the first two IMFs were finally selected for further analysis using the Hilbert transform.
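One way to inspect IMFs in the frequency domain is to estimate each IMF's power spectrum and measure how much of its power falls in the 8-30 Hz band. The sketch below does this with Welch's method from SciPy; the band-power fraction, the function name and the test tones are illustrative assumptions, not the selection rule actually applied in this work.

```python
import numpy as np
from scipy.signal import welch


def band_power_fraction(imf, fs, band=(8.0, 30.0)):
    """Fraction of an IMF's estimated power that lies inside the given band."""
    freqs, psd = welch(imf, fs=fs, nperseg=min(256, len(imf)))
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return psd[mask].sum() / psd.sum()


# Hypothetical components: one inside the mu band, one well below it.
fs = 128
t = np.arange(0, 4, 1 / fs)
in_band = band_power_fraction(np.sin(2 * np.pi * 12 * t), fs)
out_of_band = band_power_fraction(np.sin(2 * np.pi * 2 * t), fs)
```

IMFs with a high in-band fraction would be natural candidates to keep for the Hilbert transform stage.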
As demonstrated above, HHT is well suited to processing nonlinear and non-stationary signals. Moreover, the Hilbert-Huang spectrum (HHS) represents the time-frequency distribution of a signal. A comparison of the Hilbert-Huang spectra of the channel C3 and C4 EEG signals for left hand motor imagery shows that the energy distributions of the two channels differ in time and frequency, and that both lie below 30 Hz, which matches the μ and β bands. This motivates analysing a specific frequency band over a certain duration.
The HHS gives the time-frequency distribution of the EEG signals. Next, the local marginal spectra of the channel C3 and C4 signals for left and right hand MI are calculated according to Equation 10, which represents the total amplitude (or energy) contribution of the frequencies of interest. The difference between the local marginal spectra of the channel C3 and C4 EEG signals for left and right hand MI is shown in Figure 4.
According to the difference between the local marginal spectra of the EEG signals, the frequency band with the largest energy difference between the two motor imagery tasks is selected, and the statistical features of mean μm and variance σ2 are calculated. In addition, this work employs a new feature, Hilbert-Huang entropy (HHE), to represent the difference between the EEG signals of the two MI tasks.
As noted previously, the local marginal spectrum of the motor imagery EEG signals changes across periods and time intervals. Therefore, this work applies a sliding time window technique to segment the original EEG signals before calculating the Hilbert-Huang entropy features. With a time window of 3 s and a step of 0.5 s at channels C3 and C4, the EEG data from 3.5 to 6.5 s is finally selected, and the HHE features are calculated to form the final feature vector for classification.
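The segmentation can be sketched as follows, using the stated trial length (9 s), sampling frequency (128 Hz), window (3 s) and step (0.5 s). The function name and the zero-filled placeholder trial are hypothetical; real trials would come from the dataset.

```python
import numpy as np


def sliding_windows(trial, fs, win_s=3.0, step_s=0.5):
    """Return (start time in s, segment) pairs along the last axis of a trial."""
    win = int(round(win_s * fs))
    step = int(round(step_s * fs))
    starts = np.arange(0, trial.shape[-1] - win + 1, step)
    return [(s / fs, trial[..., s:s + win]) for s in starts]


# Placeholder trial: 2 channels (C3, C4), 9 s at 128 Hz.
fs = 128
trial = np.zeros((2, 9 * fs))
windows = sliding_windows(trial, fs)
```

With these settings there are 13 candidate windows per trial, and the eighth one (index 7) covers 3.5 to 6.5 s, the interval selected in this work.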
The SVM algorithm with the RBF kernel function is applied to discriminate the extracted HHE features. Before classification, it is important to find the optimal values of the RBF-SVM parameters c and g. In this research, a genetic algorithm (GA) is used to select the best c and g, with a maximum of 100 generations, a maximum population size of 20, and 5-fold cross validation (CV). Figure 5 shows the parameter optimization.
With the features of the training dataset, the best accuracy is 85%. The optimal parameters are then used to classify the training dataset and the testing dataset, and the classification accuracies are 85.57% and 85.0% respectively. The maximum classification accuracies obtained with the mean μm and variance σ2 features of the local marginal spectrum are compared with those of the HHE features in Table 1.
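The objective that such a search optimises is the 5-fold cross-validation accuracy of an RBF-SVM for a given pair (c, g). The sketch below evaluates that objective with scikit-learn and, for brevity, replaces the genetic algorithm with a plain random search over log-uniform candidates. The search ranges, the random-search stand-in, the mapping of c and g to scikit-learn's `C` and `gamma`, and the synthetic two-class data are all assumptions for illustration.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC


def cv_accuracy(features, labels, c, g, folds=5):
    """Mean k-fold cross-validation accuracy of an RBF-SVM with given (c, g)."""
    clf = SVC(kernel="rbf", C=c, gamma=g)
    return cross_val_score(clf, features, labels, cv=folds).mean()


def random_search(features, labels, n_candidates=20, seed=0):
    """Stand-in for the GA: sample (c, g) log-uniformly and keep the best pair."""
    rng = np.random.default_rng(seed)
    best_c, best_g, best_acc = None, None, -1.0
    for _ in range(n_candidates):
        c = 10.0 ** rng.uniform(-2, 2)     # assumed search range for c
        g = 10.0 ** rng.uniform(-3, 1)     # assumed search range for g
        acc = cv_accuracy(features, labels, c, g)
        if acc > best_acc:
            best_c, best_g, best_acc = c, g, acc
    return best_c, best_g, best_acc


# Placeholder data: 140 synthetic "trials" with 2 features each, labels 1 and 2.
rng = np.random.default_rng(1)
labels = np.repeat([1, 2], 70)
features = rng.normal(size=(140, 2)) + np.where(labels[:, None] == 1, -1.5, 1.5)
best_c, best_g, best_acc = random_search(features, labels)
```

A genetic algorithm would use the same `cv_accuracy` value as its fitness function and differ only in how new candidate pairs are generated from one generation to the next.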
| Features | Training dataset | Testing dataset |
Table 1: The classification results of HHE feature compared with μm and σ2.
Moreover, the identification result of the RBF-SVM classifier is also evaluated by sensitivity and specificity, defined as
$$\mathrm{Sensitivity} = \frac{TP}{TP + FN}, \qquad \mathrm{Specificity} = \frac{TN}{TN + FP}$$
where TP and TN are the numbers of correctly detected left hand tasks and right hand tasks respectively, and FP and FN are the numbers of falsely detected left hand tasks and right hand tasks respectively. Figure 6 shows the identification result of the test trials using the RBF-SVM classifier.
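As a small worked example, the function below counts the four outcomes and returns sensitivity and specificity using the standard definitions TP / (TP + FN) and TN / (TN + FP). It treats the left hand class (label 1) as the positive class, following the description above; the label vectors are made up for illustration.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Sensitivity and specificity with `positive` as the positive class."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1      # left hand trial detected as left
            else:
                fn += 1      # left hand trial detected as right
        else:
            if pred == positive:
                fp += 1      # right hand trial detected as left
            else:
                tn += 1      # right hand trial detected as right
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels: 1 = left hand, 2 = right hand.
y_true = [1, 1, 1, 2, 2, 2, 2]
y_pred = [1, 1, 2, 2, 2, 2, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

Here two of three left hand trials and three of four right hand trials are detected correctly, so the sensitivity is 2/3 and the specificity is 3/4.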
Table 2 shows the sensitivity and specificity on the test trials for the three features of the two mental tasks.
Table 2: The sensitivity and specificity of RBF-SVM for three features of test trials.
The receiver operating characteristic (ROC) curve is a comprehensive measure of the relationship between sensitivity and specificity, drawn with sensitivity on the y-axis and 1-specificity on the x-axis. A series of sensitivity and specificity values is calculated by sweeping the decision threshold over a range of values. The larger the area under the ROC curve, the better the classification performance. The ROC curves of RBF-SVM for the different features are shown in Figure 7. The area under the ROC curve of the HHE feature is clearly the largest, which indicates that HHE is the best of the compared methods for classifying the two mental tasks.
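The threshold sweep behind an ROC curve can be sketched in NumPy as below. The sketch assumes binary labels coded as 1 (positive) and 0 (negative), continuous classifier scores without ties, and trapezoidal integration for the area; the scores and labels are invented for the example.

```python
import numpy as np


def roc_points(scores, labels):
    """False and true positive rates as the threshold sweeps from high to low."""
    order = np.argsort(-np.asarray(scores, dtype=float))
    sorted_labels = np.asarray(labels)[order]
    true_pos = np.cumsum(sorted_labels == 1)
    false_pos = np.cumsum(sorted_labels == 0)
    tpr = np.concatenate(([0.0], true_pos / true_pos[-1]))     # sensitivity
    fpr = np.concatenate(([0.0], false_pos / false_pos[-1]))   # 1 - specificity
    return fpr, tpr


def area_under_curve(fpr, tpr):
    """Trapezoidal area under the ROC curve."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))


# Hypothetical classifier scores (higher means more likely positive).
scores = [0.9, 0.8, 0.7, 0.6]
labels = [1, 0, 1, 0]
fpr, tpr = roc_points(scores, labels)
auc = area_under_curve(fpr, tpr)
```

In this example three of the four positive-negative pairs are ranked correctly, so the area is 0.75; a perfect ranking gives 1.0 and a random one about 0.5.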
In this work, the HHT algorithm is used to analyse EEG signals for left and right hand MI. First, the EMD method decomposes the EEG signals into IMFs in decreasing order of frequency, and then the selected IMFs that fit the required frequency band are used to compute the Hilbert-Huang spectrum, which displays the time-frequency distribution of the EEG signals.
In general, the marginal spectrum h (ω) measures the total amplitude (or energy) contribution of each frequency of interest. Therefore, many researchers obtain EEG features from the local marginal spectrum. However, the traditional features are usually statistical features, which are not sufficient to describe the inner information of EEG signals. Shannon entropy (SE) reveals irregular changes in signal properties such as the spectrum and amplitude distribution. Therefore, in this work, the SE method combined with the HHT technique is used to extract features of EEG signals in left and right hand motor imagery. To obtain the optimal HHE features, a sliding time window technique is used to segment the original EEG signals before the HHE features are calculated, and the RBF-SVM algorithm, tuned with a genetic algorithm (GA), is applied to discriminate the extracted HHE features. Finally, this work not only analyses the classification accuracy, but also evaluates the identification performance of the RBF-SVM classifier with the ROC curve.
From the above results, the classification performance of the HHE features proposed in this work is higher than that of the statistical features of mean μm and variance σ2 of the local marginal spectrum. The classification accuracy reaches the level of the BCI competition results (the highest accuracy in the BCI 2003 competition is 89.29%, the second 84.29% and the third 82.86%), and the ROC area of the HHE feature is the largest. Therefore, the proposed method for extracting EEG features of motor imagery is effective and may also be used for processing other similar signals.
This work proposes a new EEG feature extraction method combining HHT and SE. HHT is applied to the EEG signals of two MI tasks to obtain their time-frequency distribution. A sliding time window technique is employed to segment the original EEG signals and acquire the local marginal spectrum at different time intervals. The part of the local marginal spectrum of interest is selected according to the μ and β rhythms, and the HHE features are calculated from it with the SE algorithm. The HHE features of the EEG signals from 3.5 to 6.5 s form the feature vectors for the RBF-SVM classifier, whose optimal parameters were obtained with the GA. The identification results indicate that the proposed method is feasible for extracting EEG features of motor imagery. Future research should apply this EEG feature extraction method to BCI applications such as wheelchair control, so as to help people who have severe physical disabilities but are cognitively intact.
This study has received kind and generous support from the National Natural Science Foundation of China (61305147) and the Natural Science Foundation of Henan Province (122300410120, 13A416858).