Removing eye activity from EEG via ICA

Python

MNE

EEG

EOG

Artifacts

ICA

EEG signals often contain eye activity (movement and/or blinks), which usually needs to be removed before performing EEG analysis. In this post, I show how to remove such ocular artifacts using independent component analysis (ICA).

Published

January 29, 2018

Modified

November 15, 2024

Background

In a previous post, we used linear regression to remove ocular artifacts from EEG signals. Independent component analysis (ICA) is a popular alternative to this approach. In a nutshell, ICA decomposes multi-channel EEG recordings into maximally independent components. Components that represent ocular activity can be identified and eliminated to reconstruct artifact-free EEG signals. This approach is described in more detail in Jung et al. (2000).

A comprehensive comparison between the two methods is beyond the scope of this post. Instead, I will only list some important properties of both techniques.

First, the regression-based approach requires EOG channels, whereas ICA works without any reference signal. This means you can use ICA to remove ocular activity even if your recording does not include EOG channels.

Both methods potentially remove brain activity in addition to ocular activity. ICA can separate ocular components from brain components well if many EEG channels are available (which in turn requires a relatively large number of data samples). A cleaner separation also means that less brain activity will be removed when ocular components are eliminated. The minimum number of EEG channels required for ICA decomposition varies, but as a rule of thumb, at least 20 channels seem to be necessary (the EEGLAB tutorial has more details on the amount of data required for ICA). In contrast, the regression approach works fine even if only a few EEG channels are available. However, the EOG reference channels always contain some amount of brain activity, which will also be removed from the data in addition to ocular activity.

The ICA method entails manual identification of ocular components, although several algorithms exist to automate this process (for example EyeCatch or ADJUST). ICA also takes longer to compute than the regression approach (but efficient implementations are available that keep computation time to a minimum). Finally, ICA is an optimization problem that is not guaranteed to find the global optimal solution. Depending on the initial conditions, the algorithm might find different independent components from run to run. However, this is not a huge issue in this application, because ocular components are relatively large and therefore stable across decompositions.

Let’s now turn to an example to see how ICA can be used to remove ocular artifacts with MNE.

Implementation

Before we start, it is worth mentioning that ICA will generally run faster using a multi-threaded numeric library such as OpenBLAS. If you install the official NumPy package (see this post for more details on how to install Python packages), you will get OpenBLAS support out of the box.

Data preprocessing

We will use the same data set that we already used with the regression approach. Specifically, we’ll use participant A01T from data set 001-2014 from the BNCI Horizon 2020 website (check out the post on the regression approach for more details on this data set). To recap, download this file and save it to your working directory. Note that this data set contains 22 EEG and 3 EOG channels. Although EOG channels can (and should) be used for ICA decomposition (provided that they use the same reference electrode as the EEG channels), we will only use EEG channels here to keep things simple.

As always, we start by importing the necessary packages:

from scipy.io import loadmat
import mne

The data comes as a MAT file, so we use scipy.io.loadmat() to load it as a NumPy array. Note that this time we only load the fourth run containing the actual experimental data – we do not need the calibration run.

mat = loadmat("A01T.mat", simplify_cells=True)
eeg = mat["data"][3]["X"] * 1e-6  # convert to volts

We will plot ICA components as projections on the scalp surface later on. To this end, MNE needs to know the channel labels, which unfortunately are not present in the data. However, the data description contains a picture of the montage, which we can use to populate a list of channel names:

ch_names = [
    "Fz",
    "FC3",
    "FC1",
    "FCz",
    "FC2",
    "FC4",
    "C5",
    "C3",
    "C1",
    "Cz",
    "C2",
    "C4",
    "C6",
    "CP3",
    "CP1",
    "CPz",
    "CP2",
    "CP4",
    "P1",
    "Pz",
    "P2",
    "POz",
    "EOG1",
    "EOG2",
    "EOG3",
]

We create an info object using this list, which we need to create a Raw object containing the EEG and associated meta data (which in our case is just the sampling frequency of 250 Hz and the channel types). Finally, we add a 10–20 montage, which maps the channel labels to their locations on the scalp (this is required for topographic plots).

info = mne.create_info(ch_names, 250, ch_types=["eeg"] * 22 + ["eog"] * 3)
raw = mne.io.RawArray(eeg.T, info)
raw.set_montage("easycap-M1")

Performing ICA

ICA does not work well in the presence of low-frequency drifts, so we create a copy of our raw object and apply a high-pass filter to this copy.

raw_tmp = raw.copy()
raw_tmp.filter(l_freq=1, h_freq=None)

Important

In the next step, we will use the Picard ICA algorithm, which is not included in MNE. Therefore, make sure to install it (see here for a quick overview of how to set up Python and install packages).

Picard is much faster than the default Extended Infomax implementation shipped with MNE, and it can compute the same solution by setting fit_params={"extended": True, "ortho": False}. Alternatively, if you prefer the FastICA solution, set fit_params={"extended": True, "ortho": True}. More details are available here.

We are now ready to perform ICA. First, we instantiate an ICA object and specify that we want to use the Picard algorithm by setting method="picard". In addition, the random_state argument should be set for reproducible results.

ica = mne.preprocessing.ICA(
    method="picard",
    fit_params={"extended": True, "ortho": False},
    random_state=1
)

Next, we fit ica to our filtered raw data raw_tmp (note that this uses only the 22 EEG channels and ignores the 3 EOG channels by default, but this could be changed with the picks argument).

ica.fit(raw_tmp)

Identifying ocular components

Our next task is to identify ocular components among the computed independent components. This is usually done by visual inspection, so we start by plotting scalp projections of all components:

ica.plot_components(inst=raw_tmp, picks=range(22))

Let’s focus on the first few components, because ocular components are generally found among these. From these scalp projections, the component labeled as ICA001 looks like it could represent eye movements due to its frontal location. To be sure, we can click on this component to open a new window with more details (this is possible because we specified inst=raw_tmp in the previous call):

Besides the scalp projection, we can now also see

the component power spectral density (which is typical for ocular activity, because the characteristic EEG alpha peak is missing),
the epochs image, which color-codes the component activity over (virtual) epochs (which shows typical intermittent activity as blue and red stripes),
and the epochs variance (which in this case is not really helpful in identifying the component).

In summary, we can be pretty sure that component ICA001 represents ocular activity. To be extra safe, let’s plot the component time courses to corroborate our assumption:

ica.plot_sources(inst=raw_tmp)

Indeed, if you scroll through the data, ICA001 does primarily capture eye movements and eye blinks.

Note that very often, two ocular components can be found in the decomposition, but this is not the case in our example data (all remaining components do not seem to originate from eye activity).

Removing ocular components

In the final step, we create a list attribute ica.exclude containing the indices of all components that should be removed when reconstructing EEG signals. In our case, this list contains only a single component:

ica.exclude = [1]

Note that you can click on the component title (ICA001) in the ICA components plot to include/exclude a component (the title of an excluded component will turn gray). This will also add/remove this component from/to the underlying ica.exclude attribute.

Now we can apply our ICA decomposition (without the excluded component) to a copy of the original (unfiltered) EEG to obtain artifact-free signals:

raw_corrected = raw.copy()
ica.apply(raw_corrected)

Visualizing results

So how did ICA perform? Let’s take a look at a segment of the original EEG containing a clear eye movement artifact:

raw.plot(n_channels=25, start=53, duration=5)

And here is the corrected signal:

raw_corrected.plot(n_channels=25, start=53, duration=5)

Looks like ICA did a pretty decent job in removing eye artifacts.

Code

from scipy.io import loadmat
import mne

mat = loadmat("A01T.mat", simplify_cells=True)

eeg = mat["data"][3]["X"] * 1e-6  # convert to volts

ch_names = [
    "Fz",
    "FC3",
    "FC1",
    "FCz",
    "FC2",
    "FC4",
    "C5",
    "C3",
    "C1",
    "Cz",
    "C2",
    "C4",
    "C6",
    "CP3",
    "CP1",
    "CPz",
    "CP2",
    "CP4",
    "P1",
    "Pz",
    "P2",
    "POz",
    "EOG1",
    "EOG2",
    "EOG3",
]

info = mne.create_info(ch_names, 250, ch_types=["eeg"] * 22 + ["eog"] * 3)
raw = mne.io.RawArray(eeg.T, info)
raw.set_montage("easycap-M1")

raw_tmp = raw.copy()
raw_tmp.filter(l_freq=1, h_freq=None)
ica = mne.preprocessing.ICA(
    method="picard",
    fit_params={"extended": True, "ortho": False},
    random_state=1
)
ica.fit(raw_tmp)

ica.plot_components(inst=raw_tmp, picks=range(22))
ica.plot_sources(inst=raw_tmp)
ica.exclude = [1]

raw_corrected = raw.copy()
ica.apply(raw_corrected)

raw.plot(n_channels=25, start=53, duration=5, title="Before")
raw_corrected.plot(n_channels=25, start=53, duration=5, title="After")