Memory testing: memtester
Before putting hardware into operation, you may want to test that it works. For RAM, we recommend the memtester diagnostic tool. Other tools exist as well, such as memtest86+. The advantage of memtester is that, in conjunction with other tools, it can provide detailed information on any errors.
Boot Ubuntu from a USB flash drive
Create a bootable Ubuntu USB flash drive. You can use, e.g., unetbootin, or download Ubuntu Desktop and copy it to the flash drive using dd.
Boot from the flash drive and select 'Try Ubuntu'.
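The dd route mentioned above can be sketched roughly as follows. This is a minimal sketch, not a definitive procedure: write_image is a hypothetical helper name, and ubuntu-desktop.iso and /dev/sdX are placeholders for your downloaded image and your flash device. It assumes GNU dd, cmp and stat.

```shell
# write_image IMAGE TARGET  (hypothetical helper, not part of any tool)
# Copies IMAGE to TARGET with dd, then verifies the copy byte for byte.
write_image() {
    local image=$1 target=$2
    # bs=4M speeds up the copy; conv=fsync flushes the data before dd exits.
    dd if="$image" of="$target" bs=4M conv=fsync status=none || return 1
    # The target device is usually larger than the image, so compare
    # only as many bytes as the image contains.
    cmp -n "$(stat -c %s "$image")" "$image" "$target"
}

# Example, run as root (placeholder names). Check the device name with
# lsblk first, because dd overwrites the target without asking:
#   write_image ubuntu-desktop.iso /dev/sdX
```

The verification step is optional, but it catches a failing flash drive before you spend time booting from it.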
Install necessary packages
First, you need to edit the list of repositories and update the index.
- Use an editor to open sources.list:
editor /etc/apt/sources.list
- Delete the deb cdrom:[...] line
- At the end of each line where the name of the current Ubuntu release is indicated (e.g., xenial, yakkety), add 'universe'
- Refresh index:
apt-get update
- Install packages using:
apt-get install mcelog openssh-server memtester edac-utils
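If you prefer not to edit sources.list by hand, the two edits from the list above could be scripted along these lines. This is a sketch under assumptions: enable_universe is a hypothetical helper name, it relies on GNU sed, and it assumes every remaining deb line points at an Ubuntu archive that actually has a universe component (true for a stock live session, not necessarily for third-party repositories).

```shell
# enable_universe [FILE]  (hypothetical helper)
# Removes the "deb cdrom:" entry and appends "universe" to every deb line
# that does not already list it. FILE defaults to /etc/apt/sources.list.
enable_universe() {
    local list=${1:-/etc/apt/sources.list}
    sed -i \
        -e '/^deb cdrom:/d' \
        -e '/^deb .*universe/!s/^deb .*/& universe/' \
        "$list"
}

# Example, run as root:
#   enable_universe && apt-get update
```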
In addition to memtester, this installs mcelog (keeps a log of any detected hardware errors) and edac-utils (detects ECC failures and other errors). The openssh-server package is needed for access via SSH; the benefits are described below.
Check that mcelog is running and that its log is OK: systemctl status mcelog.
If errors such as "mcelog: Warning: cpu 0 offline?, imc_log not set" occur, load the msr kernel module: modprobe msr && systemctl restart mcelog (see 1, 2).
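The check and the workaround can be combined into one conditional step. The sketch below is only an illustration: needs_msr is a hypothetical helper, and it simply looks for the two fragments of the warning quoted above in whatever text it receives on standard input.

```shell
# needs_msr  (hypothetical helper)
# Reads mcelog status or log text on stdin and succeeds if it contains
# the "cpu 0 offline" / "imc_log not set" warning.
needs_msr() {
    grep -q -e 'cpu 0 offline' -e 'imc_log not set'
}

# Example, run as root on the tested machine:
#   if systemctl status mcelog 2>&1 | needs_msr; then
#       modprobe msr && systemctl restart mcelog
#   fi
```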
Log in to the tested machine via SSH
If you log in to the tested machine via SSH, you will be able to see outputs on the console even if a critical failure occurs.
To log in as root, you must set a root password and enable root login (i.e., in the /etc/ssh/sshd_config file, change the 'PermitRootLogin' option to 'yes'). You can do so with the following commands:
sudo passwd
sed -i 's/^\(PermitRootLogin\).*/\1 yes/' /etc/ssh/sshd_config
systemctl reload sshd
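Note that the sed command above only matches an uncommented PermitRootLogin line. On some releases the option ships commented out or is missing from the file, in which case nothing changes. A slightly more defensive variant might look like this; permit_root_login is a hypothetical helper name, and the sketch assumes GNU sed and grep.

```shell
# permit_root_login [FILE]  (hypothetical helper)
# Sets "PermitRootLogin yes", whether the option is present, commented
# out, or missing entirely. FILE defaults to /etc/ssh/sshd_config.
permit_root_login() {
    local conf=${1:-/etc/ssh/sshd_config}
    local pattern='^[[:space:]]*#?[[:space:]]*PermitRootLogin[[:space:]].*'
    if grep -qE "$pattern" "$conf"; then
        # Rewrite the existing (possibly commented) line in place.
        sed -i -E "s/$pattern/PermitRootLogin yes/" "$conf"
    else
        # Option not present at all: append it.
        echo 'PermitRootLogin yes' >> "$conf"
    fi
}

# Example, run as root. sshd -t checks the configuration for syntax
# errors before the reload:
#   permit_root_login && sshd -t && systemctl reload sshd
```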
Log in to the machine via SSH. Tip: if you do not have another computer with you at the time, you can use SSH to log in from the tested machine to another computer to which you have access. On that computer, run screen (a terminal multiplexer) and use it to log back in to the tested machine. If anything goes wrong, you can log in to the machine running screen at any time and from any location, run screen -r, and return to your virtual screen. You may run screen with the -L option, which turns on automatic output logging. The log file is named screenlog.0.
Stop all unnecessary processes to free up memory
Warning! If you work on the tested machine locally, first switch to a text console, because init 3 will turn the graphical environment off. You can do so by pressing Ctrl+Alt+F2 or by entering the sudo chvt 2 command.
init 3
systemctl stop avahi-daemon cups-browsed [...]
echo 3 > /proc/sys/vm/drop_caches
Memory test
First determine how much memory is available to you: run free -h and look at the 'available' column.
Another option is to run memtester with more memory than you can get and read the available amount from its output ('want' refers to the amount requested, 'got' to the amount actually obtained).
Run the test. We recommend not using all the available memory but keeping some free (approximately 1 GB):
date; memtester 499G; date
Please note that memtester runs indefinitely until you stop it. If appropriate, you can limit it to a particular number of iterations (the memtester 499G 3 command limits it to 3 iterations).
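Rather than reading the value off free -h by hand, the size argument can be derived from the MemAvailable field in /proc/meminfo with the recommended reserve subtracted. The following is a minimal sketch: memtester_size_mb is a hypothetical helper, and the 1024 MB default reserve simply mirrors the "approximately 1 GB" suggestion above.

```shell
# memtester_size_mb [MEMINFO] [RESERVE_MB]  (hypothetical helper)
# Prints MemAvailable minus RESERVE_MB, in whole megabytes.
# Fails (prints nothing) if the result would not be positive.
memtester_size_mb() {
    local meminfo=${1:-/proc/meminfo} reserve_mb=${2:-1024}
    awk -v reserve="$reserve_mb" '
        /^MemAvailable:/ { mb = int($2 / 1024) - reserve }  # $2 is in kB
        END { if (mb > 0) print mb; else exit 1 }
    ' "$meminfo"
}

# Example, run as root: test everything except about 1 GB, 3 iterations.
#   size=$(memtester_size_mb) && memtester "${size}M" 3
```

Working in megabytes avoids rounding a large machine down to the nearest whole gigabyte.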
Any errors will be listed by memtester or logged to the console by mcelog. When the run is complete, you can check the output of the edac-util and dmesg commands.
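If the session was logged (for example to screenlog.0 by screen -L), the log can be scanned for failures after the run. The sketch below assumes that memtester marks failed tests with lines containing the word FAILURE, which matches its usual output but is worth confirming against your version. count_failures is a hypothetical helper.

```shell
# count_failures [LOG]  (hypothetical helper)
# Prints the number of lines in LOG that contain "FAILURE".
# LOG defaults to screenlog.0 in the current directory.
count_failures() {
    local log=${1:-screenlog.0}
    # -a treats the log as text even if it contains control characters
    # from memtester's progress display.
    grep -a -c 'FAILURE' "$log" || true
}

# Example post-run check on the tested machine:
#   count_failures screenlog.0
#   edac-util
#   dmesg | grep -i -E 'mce|edac|machine check'
```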