This presentation covers the well-known Meltdown and Spectre vulnerabilities and how they exploit imperfections of modern processors at the architectural level. In this context it discusses the ARM architecture, which is now a standard in embedded systems.
The talk was delivered by Andrii Lukin (Senior Software Engineer, Consultant, GlobalLogic) at GlobalLogic Embedded Career Day #2 on February 10, 2018.
More about GlobalLogic Embedded Career Day #2: https://www.globallogic.com/ua/events/globallogic-kyiv-embedded-career-day-2-materials
3. 3
RISC -- Reduced Instruction Set Computer
● Small set of simple and general instructions
● Fixed-length instructions
● Simpler processor core logic
● Harvard architecture -- architecture with physically separate storage and
signal pathways for instructions and data
● Load/Store architecture -- separate instructions for memory access
● Many general-purpose registers, or even multiple register files
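The load/store idea above can be sketched with a toy register machine. This is a minimal illustration only: the opcode names (MOVI, LDR, STR, ADD), the 4-field instruction format, and the run function are hypothetical and do not correspond to real ARM encodings.

```python
def run(program, memory, num_regs=8):
    """Execute a toy load/store program and return the final register values.

    Every instruction is a fixed-length 4-tuple (op, a, b, c).
    Only LDR and STR touch memory; ADD works on registers only.
    """
    regs = [0] * num_regs
    for op, a, b, c in program:
        if op == "MOVI":      # regs[a] = immediate b (c unused)
            regs[a] = b
        elif op == "LDR":     # regs[a] = memory[regs[b] + c]
            regs[a] = memory[regs[b] + c]
        elif op == "STR":     # memory[regs[b] + c] = regs[a]
            memory[regs[b] + c] = regs[a]
        elif op == "ADD":     # regs[a] = regs[b] + regs[c]
            regs[a] = regs[b] + regs[c]
        else:
            raise ValueError(f"unknown opcode: {op}")
    return regs


# memory[2] = memory[0] + memory[1], done the load/store way
memory = [5, 7, 0]
program = [
    ("MOVI", 0, 0, 0),  # r0 = base address 0
    ("LDR", 1, 0, 0),   # r1 = memory[r0 + 0]
    ("LDR", 2, 0, 1),   # r2 = memory[r0 + 1]
    ("ADD", 3, 1, 2),   # r3 = r1 + r2
    ("STR", 3, 0, 2),   # memory[r0 + 2] = r3
]
regs = run(program, memory)
```

Note that adding two memory operands takes five instructions here, because arithmetic never reads or writes memory directly.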
12. 12
Exceptions
● An exception is synchronous if it is generated as a result of execution or
attempted execution of the instruction stream, and the return address provides
details of the instruction that caused it.
● An exception is asynchronous if it is not generated by executing instructions;
the return address might not always provide details of what caused the exception.
● In the ARMv7-A architecture, the Prefetch Abort, Data Abort, and Undefined
Instruction exceptions are separate items.
● In AArch64, all of these events generate a Synchronous abort. The exception
handler may then read the syndrome (ESR_ELn) and fault address (FAR_ELn)
registers to obtain the necessary information to distinguish between them.
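How a handler might tell these cases apart can be sketched by decoding a syndrome value. The sketch assumes the AArch64 ESR_ELx layout (EC in bits [31:26], IL in bit 25, ISS in bits [24:0]) and lists only a few exception classes; the function names decode_esr and classify are hypothetical, and a real handler would do this in C or assembly.

```python
# A small subset of ESR_ELx exception classes (EC field), for illustration.
EC_NAMES = {
    0x00: "Unknown reason (includes undefined instructions)",
    0x20: "Instruction Abort from a lower Exception level",
    0x21: "Instruction Abort without a change in Exception level",
    0x24: "Data Abort from a lower Exception level",
    0x25: "Data Abort without a change in Exception level",
}


def decode_esr(esr):
    """Split a 32-bit syndrome value into (EC, IL, ISS)."""
    ec = (esr >> 26) & 0x3F   # exception class, bits [31:26]
    il = (esr >> 25) & 0x1    # instruction length bit, bit 25
    iss = esr & 0x1FFFFFF     # instruction specific syndrome, bits [24:0]
    return ec, il, iss


def classify(esr):
    """Return a human-readable name for the exception class."""
    ec, _, _ = decode_esr(esr)
    return EC_NAMES.get(ec, f"Other exception class 0x{ec:02x}")


print(classify(0x96000045))
```

For a Data Abort or Instruction Abort, the handler would additionally read FAR_ELn to find the faulting virtual address.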
19. 19
MMU - Caches
● Point of Coherency (PoC) -- is the point at which all observers, for example,
cores, DSPs, or DMA engines, that can access memory, are guaranteed to
see the same copy of a memory location. Typically, this is the main external
system memory.
● Point of Unification (PoU) -- is the point at which the instruction and data
caches and translation table walks of the core are guaranteed to see the
same copy of a memory location
21. 21
MMU - Normal memory
● Normal memory -- The processor can re-order, repeat, and merge accesses
to it.
Furthermore, address locations that are marked as Normal can be accessed
speculatively by the processor, so that data or instructions can be read from
memory without being explicitly referenced in the program, or in advance of
the actual execution of an explicit reference. Such speculative accesses can
occur as a result of branch prediction, speculative cache linefills, out-of-order
data loads, or other hardware optimizations.
22. 22
MMU - Device memory
● Device memory -- four types, from most to least restrictive:
○ Device-nGnRnE (most restrictive; equivalent to Strongly Ordered
memory in the ARMv7 architecture)
○ Device-nGnRE
○ Device-nGRE
○ Device-GRE (least restrictive)
● Gathering or non-Gathering (G or nG) -- whether multiple accesses can be
merged into a single bus transaction for this memory region.
● Re-ordering (R or nR) -- whether accesses to the same device can be
re-ordered with respect to each other.
● Early Write Acknowledgement (E or nE) -- whether an intermediate write
buffer between the processor and the slave device being accessed is allowed
to send an acknowledgement of a write completion.
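The relationship between the four Device types and the G, R, and E properties can be sketched as a small decoder. It assumes the ARMv8 MAIR_ELx attribute encoding 0b0000dd00 for Device memory (dd = 00 for nGnRnE up to 11 for GRE); the function name decode_device_attr is hypothetical.

```python
DEVICE_TYPES = {
    0b00: "Device-nGnRnE",
    0b01: "Device-nGnRE",
    0b10: "Device-nGRE",
    0b11: "Device-GRE",
}


def decode_device_attr(attr):
    """Decode an 8-bit MAIR attribute byte that describes Device memory.

    Returns (name, gathering, reordering, early_write_ack).
    Raises ValueError for encodings that are not of the form 0b0000dd00.
    """
    if attr & 0xF0 or attr & 0x03:
        raise ValueError(f"0x{attr:02x} is not a Device memory encoding")
    dd = (attr >> 2) & 0b11
    early_write_ack = dd >= 0b01   # E is allowed for all types but nGnRnE
    reordering = dd >= 0b10        # R is allowed for nGRE and GRE
    gathering = dd == 0b11         # G is allowed only for GRE
    return DEVICE_TYPES[dd], gathering, reordering, early_write_ack


for attr in (0x00, 0x04, 0x08, 0x0C):
    print(hex(attr), decode_device_attr(attr))
```

Each step down the list relaxes exactly one restriction, which is why the types form a strict ordering from most to least restrictive.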
32. 32
Conspiracy theory
Intel:
The vulnerability has been there for 10 to 20 YEARS
But “Flush+Reload” has been known since at least 2014
ARM:
The vulnerability was introduced in the latest, most powerful designs
39. 39
Variant 3: using speculative reads of inaccessible data
The perturbation of the cache by the LDR X5, [X6,X3] (line 7) can subsequently be measured by the EL0 code for
different values of the shift amount imm (line 5). This gives a mechanism to establish the value of the EL1 data at
the address pointed to by X4, so leaking data that should not be accessible to EL0 code.
1 LDR X1, [X2] ; arranged to miss in the cache
2 CBZ X1, over ; This will be taken but
3 ; is predicted not taken
4 LDR X3, [X4] ; X4 points to some EL1 memory
5 LSL X3, X3, #imm
6 AND X3, X3, #0xFC0
7 LDR X5, [X6,X3] ; X6 is an EL0 base address
8 over
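Why varying imm reveals the value can be sketched by modelling only the address arithmetic of the LSL/AND/LDR sequence. The AND with #0xFC0 keeps offset bits [11:6], which (assuming 64-byte cache lines) selects one of 64 lines in a 4 KB region starting at X6. The Python below performs no speculation and no timing measurement; it assumes the attacker can somehow observe which line was touched, and the names touched_line and recover_low_12_bits are hypothetical.

```python
LINE_SIZE = 64                    # assumed cache line size in bytes
MASK_64 = 0xFFFFFFFFFFFFFFFF      # X registers are 64 bits wide


def touched_line(secret, imm):
    """Index of the line at base X6 touched by LDR X5, [X6, X3]."""
    x3 = (secret << imm) & MASK_64   # LSL X3, X3, #imm
    x3 &= 0xFC0                      # AND X3, X3, #0xFC0
    return x3 // LINE_SIZE           # offset bits [11:6] -> line 0..63


def recover_low_12_bits(observe):
    """Rebuild secret bits [11:0] from two observations.

    observe(imm) must return the line index seen for that shift amount.
    """
    low6 = observe(6)    # imm = 6 moves secret bits [5:0] into offset bits [11:6]
    high6 = observe(0)   # imm = 0 exposes secret bits [11:6] directly
    return (high6 << 6) | low6


secret = 0xDEADBEEF
leaked = recover_low_12_bits(lambda imm: touched_line(secret, imm))
print(hex(leaked))
```

Each observation yields six bits, so two shift amounts are enough for the low twelve bits in this model. In a real attack the observation step would be a cache timing measurement such as Flush+Reload.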
40. 40
Variant 3a: using speculative reads of system registers
In much the same way as with the main Variant 3, in a small number of Arm implementations, a processor that
speculatively performs a read of a system register that is not accessible at the current exception level, will actually
access the associated system register (provided that it is a register that can be read without side-effects).
LDR X1, [X2]       ; arranged to miss in the cache
CBZ X1, over       ; This will be taken
MRS X3, TTBR0_EL1
LSL X3, X3, #imm
AND X3, X3, #0xFC0
LDR X5, [X6,X3]    ; X6 is an EL0 base address
over
This can be used to read cryptographic keys from system registers when the ARM
Pointer Authentication feature (ARMv8.3) is in use.