YuHoyoung1
ChoMannhee2
LeeSangil3
LeeHyung-Min3*
KimYoungmin2†
-
(School of Comp. & Info. Engineering, Kwangwoon University, Kwangwoon-ro 20, Nowon-gu,
Seoul 01897, Korea)
-
(School of Electronic and Electrical Engineering, Hongik University, Wausan-ro 94,
Mapo-gu, Seoul 04066, Korea)
-
(School of Electrical Engineering, Korea University, Seoul 02841, Korea)
Copyright © The Institute of Electronics and Information Engineers(IEIE)
Index Terms
FPGA reverse engineering, non-invasive attack, bitstream, logic extraction, Vivado design suite, Project X-Ray
I. Introduction
In the early days of their development, field-programmable gate arrays (FPGAs) had
limited use for testing early models of application-specific integrated circuits in
areas such as aerospace engineering and space industry. However, the use of FPGAs
has increased and diversified in various fields owing to the advantages of easy circuit
development and continuous circuit updates. In particular, because FPGAs are key components
in Internet of Things (IoT) devices, which constitute the most representative industry
in the fourth industrial revolution era, the demand for FPGAs is growing exponentially.
In addition, parallel data processing is possible with FPGAs. This feature is well-suited
for use in image processing and artificial intelligence environments. However, owing
to the rapid growth in demand for FPGAs, issues related to FPGA security have also
become increasingly prominent.
FPGAs have the advantage of being reprogrammable, which facilitates easy hardware
updates. Therefore, most FPGAs are based on static random-access memory (SRAM), and
the programmable read-only memory (PROM) must be installed outside the FPGA to store
FPGA configuration data (bitstreams). When an FPGA is powered on, the bitstream stored
in the PROM is downloaded to the FPGA. This bitstream can then be extracted from the
download path using a signal measuring instrument such as a logic analyzer. The reverse
engineering approach of extracting bitstream files from external memory is a non-invasive
approach that does not damage the FPGA. Therefore, this method may be widely used
in fields that require non-invasive approaches to analyzing target hardware, such
as criminal investigation, military defense, and security. Recently, manufacturers
of FPGAs have offered new ways to store bitstreams via complex programmable logic
devices or microprocessors, avoiding the need for PROMs (1).
An attacker is able to launch a Trojan attack that inserts malicious circuitry inside
the FPGA through extracted bitstreams. The user’s personal information can then be
leaked through these malicious circuits. To prevent attacks via bitstreams, various
FPGA manufacturers provide bitstream encryption methods such as AES-256 encryption.
However, it is essential to identify the root cause of the attacks and take appropriate
countermeasures because various methods for extracting encryption keys are being actively
researched (2). The FPGA reverse engineering method proposed in this study assumes that a bitstream
is not encrypted, and this study can be extended to research on FPGA security.
Xilinx is the largest FPGA manufacturer, accounting for more than 60% of the market.
We aimed to perform reverse engineering on the most representative FPGA from
Xilinx, which offers two design compilers: the ISE design suite and the Vivado design
suite. The ISE design suite only supports families before the 7-series, whereas Vivado
supports recent products such as the 7-series families.
The reverse engineering of Xilinx FPGAs can be divided into programmable interconnect
point (PIP) and programmable logic point (PLP) extraction. PIPs are used to interconnect
input/output blocks (IOBs), configurable logic blocks (CLBs), and various tiles using
programmable complementary metal-oxide semiconductor (CMOS) transistor switches.
They are primarily configured inside interconnect (INT) tiles. PLPs are used to implement
logic circuits inside CLBs. The interior of a CLB is composed of a lookup table (LUT),
multiplexer (MUX), and carry module. Various logic circuits can be implemented according
to LUT input combinations.
In the Xilinx ISE design suite, the Xilinx design language (XDL) is used to represent
netlists in text format. Reverse engineering has been performed on families before
the 7-series based on a database that analyzes the correlation between bitstreams
and the XDL (3-14). Initially, reverse engineering was performed only on the correlations between the
XDL and bitstreams, but this approach had very low accuracy because the XDL only includes
information regarding programmable points and contains few correlated data points
(12). To expand on a previous study (13), XDLRC (XDL report) files were added. These files record the structural information
for all FPGA internal circuits as correlation data, leading to higher accuracy. However,
reverse engineering was conducted only on the PIP information of INT tiles without
considering the technical characteristics of XDLRC structures. Recent papers have
reported the complete reverse engineering of FPGAs in ISE environment (14-16).
The Xilinx Vivado design suite environment, which supports the newest FPGA families,
does not provide XDL or XDLRC generation functions for security reasons, implying
that the previous reverse engineering method used in the ISE design suite environment
is not applicable to the Vivado environment. Therefore, a new method is required to
reverse engineer FPGAs in the Vivado environment. Unlike the ISE environment, which
provided XDL and XDLRC generation functions, Vivado provides no reference information
or database against which the compiled design can be compared with the bitstream, so
it is essential to analyze the bitstream structure of FPGAs in the Vivado environment
in detail. Reverse engineering requires the patterns in bitstream generation and the
bitstream addresses of the data used for configuring each FPGA element.
Jeong et al. (17) succeeded in extracting a single LUT by locating LUT configuration bits. A simple
logic function consisting of a single LUT was implemented and compiled into a bitstream
using the Vivado design suite. Because the design did not utilize other elements of the
FPGA, the generated bitstream contained minimal data: '1' and '0' for active bits and
'0' for most other bits. By locating the active bits in the bitstream, it was possible
to determine which parts of the bitstream configure the LUT used in the design. Then,
by comparing the bitstreams generated from different logic functions and analyzing how
the values change, it was possible to find the pattern in which LUT configuration bits
are encoded. The extracted bits were used to write a truth table listing the output
value for each combination of input values, that is, the logic implemented in the LUT.
The experiments showed that extracting LUT logic directly from a bitstream is possible.
However, because PIP information, which describes the connection of signals across
multiple LUTs, could not be obtained, this method could not be extended to reverse
engineer a complex circuit consisting of two or more LUTs with connected inputs.
Fig. 1. Flowchart of proposed FPGA reverse engineering.
In our study, to extract PLP information, we created various sample designs and obtained
the address of PLPs in the bitstream. The extracted bits were assigned to the outputs
of a truth table based on the LUT structure to acquire the logic function of the circuit.
PIP information was extracted based on Project X-Ray (18), which provides PIP information on FPGAs in the Vivado environment. In this study, we
also propose a method that can combine multiple bits of PLP and PIP information to
extract a complete circuit. With the proposed method, conventional research on extracting
logic from a single LUT can be extended to extracting logic from multiple LUTs. The
proposed reverse engineering flow for a Xilinx FPGA in Vivado environment is presented
in Fig. 1.
The first step is to extract the bitstream stored in the PROM of the FPGA board. As
soon as power is applied to the FPGA, the bitstream is downloaded from the PROM to
the FPGA. Data are extracted using a data measurement instrument during download.
Next, the extracted bitstream is used to identify the FPGA chip. Because FPGAs have
slightly different structures among series, it is necessary to apply different reverse
engineering algorithms based on chip identification. In this study, reverse engineering
was performed on the Artix-7 Family xc7a50tfgg484-1 chip. Thereafter, the PIP and
PLP are extracted separately, and one type of logic is extracted by combining the
two types of programmable points (PIP and PLP).
The main contributions of our study are summarized as follows.
· We verified our reverse engineering method for the Xilinx 7-series Artix-7 Family
xc7a50tfgg484-1 chip on Vivado design suite version 18.2.
· We proposed a reverse engineering method for FPGAs in the Vivado environment that
combines both PLP and PIP information to extract the complete logic composed of multiple LUTs.
· We provided guidelines on extracting PLP and PIP information from the bitstream for
7-series FPGAs in the Vivado environment.
· We verified through experiment that our proposed method can extract the complete
logic composed of two LUTs.
Fig. 2. Structure of Xilinx ARTIX-7 family xc7a50tfgg484-1 chip.
II. Background
The structure of a Xilinx FPGA is presented in Fig. 2. The clock region consists of INT, IOB, CLB, BRAM, and DSP tiles. The INT tiles are
used to connect various tiles, such as IOB and CLB, using internal CMOS transistor
switches. The CLB tile is divided into two slices, each containing LUT, MUX, and flip-flop
(FF) modules. Each clock region contains a total of 600 CLBs.
A CLB’s internal LUT can generate a variety of logical expressions using six inputs
controlled by a total of 64 bits. In the Vivado environment, a user can select and
use a desired LUT. By repeatedly generating bitstreams from designs that each use a
single LUT, a group of sample bitstreams can be created. The positions of the 64 bits
that make up each LUT can then be identified from the positions of the active bits in
the bitstream.
1. Bitstream Structure
The bitstream structures (3) of all members of the Xilinx FPGA series are the same. A bitstream is divided into
a header, command frame, configuration data, and cyclic redundancy check (CRC). The
header contains information such as the chip name of the target FPGA, project name,
creation date, and bitstream size. The command frame contains information such as
a frame address register and IDCODE, which indicates the start position of the configuration
data. Every FPGA is assigned a unique IDCODE, which identifies the type of FPGA. The
configuration data determine the configuration of the FPGA’s internal circuitry. Depending
on the configuration data values, various circuits can be designed within an FPGA.
Finally, the CRC determines whether the stored bitstreams from the PROM have been
successfully downloaded to the FPGA. The CRC function can be activated or deactivated.
In this study, the CRC function was deactivated.
When a real bitstream is downloaded to an FPGA, the header data are deleted, and the
download starts at the command frame. For the FPGA reverse engineering proposed in
this study, the command frame and CRC information are not needed. Therefore, only
configuration data are required.
Fig. 3. FPGA index database.
2. Project X-Ray
Project X-Ray is a reverse engineering project that analyzes Xilinx’s 7-series FPGAs
and is a follow-up project to Project IceStorm (19,20), in which reverse engineering was performed on Lattice’s iCE40 FPGA. Because XDL
and XDLRC files are not provided by the Vivado design suite, analysis using only bitstreams
is necessary. Project X-Ray uses the Vivado design suite to generate a large number of
designs targeting 7-series FPGAs, in a manner similar to fuzz testing. Each design
has different elements within the FPGA enabled or disabled, so the generated bitstreams
have different values at the addresses configuring each element of the FPGA. By
comparing the large group of designs with the generated bitstreams, Project X-Ray can
build a database listing the element-bitstream correlation. The project is currently
under development and available as open-source code, allowing for the collection of
various device data from multiple users (21).
The basic flow of Project X-Ray toolchain is as follows. First, one creates sample
designs that utilize specific devices in the target FPGA (Artix-7, Kintex-7, and Zynq-7
families). Next, the Xilinx Vivado design suite is used to output these designs as
a bitstream. The output bitstream travels through the Project X-Ray toolchain, which
outputs FPGA documentation data, containing information about the active elements
within the design, and its graphical representation in the form of an HTML file. The
FPGA documentation data is then uploaded to update the database of Project X-Ray.
This database is organized according to the product families of target FPGAs and is
maintained in HTML format (22).
The database contains 2D tile arrays represented on the x- and y-axis (in the same
manner as the actual FPGA tile positions), as depicted in Fig. 3, as well as the bitstream information comprising each tile, as depicted in Fig. 4. The index database depicted in Fig. 3 organizes the whole FPGA tile structure into 2D matrices, in which the position of
each tile is indicated by the x- and y-axis coordinates. Tiles with names starting
with ‘CLB’ are Configurable Logic Block tiles containing information on logic elements,
and tiles with ‘INT’ are Interconnect tiles containing PIP information.
The database contains information on the PIPs inside the FPGA, describing the patterns
in which the bits configuring the PIPs are distributed in the bitstream. Fig. 4 presents a part of the 2D matrix of PIP configuration bits included in the INT tile
database page of the X-Ray database. The information contained in this matrix of PIP
configuration bits represents whether the PIPs, connections between various elements
in the FPGA, are active or not. The matrix is divided into multiple groups, in which
each group consists of multiple PIPs sharing the same PIP configuration bits within
the bitstream. Each group has several PIP starting points for a particular PIP end
point, and the number of PIP configuration bits varies with the number of PIP starting
points. For example, the highlighted IMUX_L13 (IM13) group is composed of 10 PIP configuration
bits. The values of these PIP configuration bits determine whether each PIP sharing
IMUX_L13 as a PIP end point is activated. The PIP configuration bits do not map
one-to-one onto PIPs; instead, they are encoded so that a combination of bits activates
one or more PIPs at the same time.
Fig. 4. Part of 2D matrix of INT_L X2Y50 PIP configuration bits.
Fig. 5. Internal composition of LUT.
As illustrated in Fig. 3, a CLB tile is attached to a corresponding INT tile to form a pair. For example,
CLB_L X2Y50, which is the first CLB recorded in the bitstream, is connected to INT_L
X2Y50. Every CLB tile consists of 10 frames. An INT tile consists of 26 frames, where
each frame consists of 64 bits in the bitstream. The 10 frames of the CLB tiles are
used after the 26 frames of the INT tiles. Because the bitstream addresses of CLB
tiles can be found by using the internal LUT in experimental designs, the INT tile
position can also be determined.
III. Reverse Engineering of Xilinx 7-Series FPGA
1. PLP Reverse Engineering
The FPGA clock region has a total height of 50 tiles, as depicted in Fig. 2. The number of CLB tiles in the clock region is 12 × 50; these configure
the logic desired by the user through the internal LUT. Therefore, the bits configured
inside the CLB are termed PLPs. The CLB is divided into two slices, each containing
LUT, MUX, and FF modules.
As depicted in Fig. 5, each LUT consists of two 32 × 1 MUXs and one 2 × 1 MUX. Various
logic expressions are generated through 6-bit MUX signals, and the inputs to the LUT
are converted into output signals. The LUT outputs a variety of logical results through
six signals that are denoted as A1-A6. Signals A1-A5 control the two 32 ×
1 MUXs and signal A6 controls the 2 × 1 MUX. In the LUT structure, signal
A6 can be used as an input condition for a logical expression that further generates
one logical expression or can be used to allow one LUT to have two outputs. Because
six inputs yield 2^6 = 64 input combinations, each requiring one output bit, the
bitstream comprising one LUT is 8 bytes long.
Table 1. FPGA LUT truth table with bit-swapping rule
O
|
A1
|
A2
|
A3
|
A4
|
A5
|
A6
|
OUT
|
63
|
0
|
0
|
0
|
0
|
0
|
0
|
0
|
47
|
1
|
0
|
0
|
0
|
0
|
0
|
1
|
62
|
0
|
1
|
0
|
0
|
0
|
0
|
0
|
46
|
1
|
1
|
0
|
0
|
0
|
0
|
1
|
61
|
0
|
0
|
1
|
0
|
0
|
0
|
0
|
45
|
1
|
0
|
1
|
0
|
0
|
0
|
1
|
60
|
0
|
1
|
1
|
0
|
0
|
0
|
0
|
44
|
1
|
1
|
1
|
0
|
0
|
0
|
1
|
15
|
0
|
0
|
0
|
1
|
0
|
0
|
1
|
31
|
1
|
0
|
0
|
1
|
0
|
0
|
0
|
14
|
0
|
1
|
0
|
1
|
0
|
0
|
1
|
30
|
1
|
1
|
0
|
1
|
0
|
0
|
0
|
13
|
0
|
0
|
1
|
1
|
0
|
0
|
1
|
29
|
1
|
0
|
1
|
1
|
0
|
0
|
0
|
12
|
0
|
1
|
1
|
1
|
0
|
0
|
1
|
28
|
1
|
1
|
1
|
1
|
0
|
0
|
0
|
59
|
0
|
0
|
0
|
0
|
1
|
0
|
0
|
43
|
1
|
0
|
0
|
0
|
1
|
0
|
1
|
58
|
0
|
1
|
0
|
0
|
1
|
0
|
0
|
42
|
1
|
1
|
0
|
0
|
1
|
0
|
1
|
57
|
0
|
0
|
1
|
0
|
1
|
0
|
0
|
41
|
1
|
0
|
1
|
0
|
1
|
0
|
1
|
56
|
0
|
1
|
1
|
0
|
1
|
0
|
0
|
40
|
1
|
1
|
1
|
0
|
1
|
0
|
1
|
11
|
0
|
0
|
0
|
1
|
1
|
0
|
1
|
27
|
1
|
0
|
0
|
1
|
1
|
0
|
0
|
10
|
0
|
1
|
0
|
1
|
1
|
0
|
1
|
26
|
1
|
1
|
0
|
1
|
1
|
0
|
0
|
9
|
0
|
0
|
1
|
1
|
1
|
0
|
1
|
25
|
1
|
0
|
1
|
1
|
1
|
0
|
0
|
8
|
0
|
1
|
1
|
1
|
1
|
0
|
1
|
24
|
1
|
1
|
1
|
1
|
1
|
0
|
0
|
55
|
0
|
0
|
0
|
0
|
0
|
1
|
0
|
39
|
1
|
0
|
0
|
0
|
0
|
1
|
0
|
54
|
0
|
1
|
0
|
0
|
0
|
1
|
1
|
38
|
1
|
1
|
0
|
0
|
0
|
1
|
1
|
53
|
0
|
0
|
1
|
0
|
0
|
1
|
1
|
37
|
1
|
0
|
1
|
0
|
0
|
1
|
1
|
52
|
0
|
1
|
1
|
0
|
0
|
1
|
0
|
36
|
1
|
1
|
1
|
0
|
0
|
1
|
0
|
7
|
0
|
0
|
0
|
1
|
0
|
1
|
0
|
23
|
1
|
0
|
0
|
1
|
0
|
1
|
1
|
6
|
0
|
1
|
0
|
1
|
0
|
1
|
1
|
22
|
1
|
1
|
0
|
1
|
0
|
1
|
0
|
5
|
0
|
0
|
1
|
1
|
0
|
1
|
1
|
21
|
1
|
0
|
1
|
1
|
0
|
1
|
0
|
4
|
0
|
1
|
1
|
1
|
0
|
1
|
0
|
20
|
1
|
1
|
1
|
1
|
0
|
1
|
1
|
51
|
0
|
0
|
0
|
0
|
1
|
1
|
0
|
35
|
1
|
0
|
0
|
0
|
1
|
1
|
0
|
50
|
0
|
1
|
0
|
0
|
1
|
1
|
1
|
34
|
1
|
1
|
0
|
0
|
1
|
1
|
1
|
49
|
0
|
0
|
1
|
0
|
1
|
1
|
1
|
33
|
1
|
0
|
1
|
0
|
1
|
1
|
1
|
48
|
0
|
1
|
1
|
0
|
1
|
1
|
0
|
32
|
1
|
1
|
1
|
0
|
1
|
1
|
0
|
3
|
0
|
0
|
0
|
1
|
1
|
1
|
0
|
19
|
1
|
0
|
0
|
1
|
1
|
1
|
0
|
2
|
0
|
1
|
0
|
1
|
1
|
1
|
1
|
18
|
1
|
1
|
0
|
1
|
1
|
1
|
0
|
1
|
0
|
0
|
1
|
1
|
1
|
1
|
1
|
17
|
1
|
0
|
1
|
1
|
1
|
1
|
0
|
0
|
0
|
1
|
1
|
1
|
1
|
1
|
0
|
16
|
1
|
1
|
1
|
1
|
1
|
1
|
1
|
Fig. 6. Frame configuration address for X25Y50 CLB.
In the Vivado environment, a user can choose the LUT to use. Therefore, by analyzing each
bitstream generated by an LUT, the positions of the configuration bits of each LUT
can be determined. Logic consisting of one LUT is designed and bitstreams are configured
to utilize all the LUTs in the FPGA.
When the configuration bits of an LUT are written to the bitstream, the 8 bytes are not
written contiguously. Instead, 2 bytes are written at a time in 404-byte intervals.
In addition, the LUT designated as ABCD in the Vivado environment is not written in
this order but is written in the order BADC. Based on the bits of the active bitstream,
the frame configuration address database for the X25Y50 CLB can be constructed as
depicted in Fig. 6.
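The gathering of one LUT's configuration bytes can be sketched in Python as follows. This is a minimal illustration rather than the tool used in this study: the function names, the offset of 100 bytes, and the synthetic buffer are hypothetical, the 404-byte interval is taken as the distance between the starts of consecutive 2-byte chunks, and the first chunk read is assumed to hold the most significant bits.

```python
FRAME_STRIDE = 404   # bytes between the starts of consecutive chunks (from the text)
CHUNK_SIZE = 2       # bytes of one LUT written at a time (from the text)
CHUNKS_PER_LUT = 4   # 4 chunks x 2 bytes = 8 bytes = 64 configuration bits


def extract_lut_bytes(config_data: bytes, start: int) -> bytes:
    """Gather the 8 configuration bytes of one LUT from the configuration data."""
    collected = bytearray()
    for k in range(CHUNKS_PER_LUT):
        offset = start + k * FRAME_STRIDE
        chunk = config_data[offset:offset + CHUNK_SIZE]
        if len(chunk) != CHUNK_SIZE:
            raise ValueError("configuration data ends before the LUT is complete")
        collected += chunk
    return bytes(collected)


def lut_bytes_to_int(raw: bytes) -> int:
    """Assumption: the first byte read holds bits 63..56 (big-endian order)."""
    return int.from_bytes(raw, "big")


# Synthetic configuration data with one LUT placed at a hypothetical offset of 100.
demo = bytearray(2000)
for k in range(CHUNKS_PER_LUT):
    pos = 100 + k * FRAME_STRIDE
    demo[pos:pos + CHUNK_SIZE] = b"\x3c\x03"

lut_value = lut_bytes_to_int(extract_lut_bytes(bytes(demo), 100))
print(hex(lut_value))
```

The real start offset of each LUT would come from a frame configuration address database such as the one in Fig. 6.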
Activated LUTs can be extracted based on the contents of the database. A total of
8 bytes are extracted in blocks of 2 bytes at intervals of 404 bytes based on the
active start bit. A truth table can be written based on the six inputs of the LUT,
as presented in Table 1. Among the extracted 64 bits, the bit mapped to the top of the bitstream is the 63rd
bit and the bit mapped to the bottom of the bitstream is the zeroth bit. When the
extracted 64 bits are specified as output variable values in the truth table, mapping
does not follow the sequence from the 63rd bit to the zeroth bit, but it follows the
sequence listed in the column highlighted “O” in Table 1.
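The bit-swapping rule can be sketched in Python as follows. The list below is transcribed from the "O" column of Table 1 in row order; the function names are illustrative, and placing the first truth-table row at the most significant bit of the swapped word is an assumption chosen so that the result agrees with the sample value 0x3C033C033C033C03 quoted in this section.

```python
# "O" column of Table 1, in truth-table row order
# (row index = A1 + 2*A2 + 4*A3 + 8*A4 + 16*A5 + 32*A6).
O_COLUMN = [
    63, 47, 62, 46, 61, 45, 60, 44,
    15, 31, 14, 30, 13, 29, 12, 28,
    59, 43, 58, 42, 57, 41, 56, 40,
    11, 27, 10, 26, 9, 25, 8, 24,
    55, 39, 54, 38, 53, 37, 52, 36,
    7, 23, 6, 22, 5, 21, 4, 20,
    51, 35, 50, 34, 49, 33, 48, 32,
    3, 19, 2, 18, 1, 17, 0, 16,
]


def truth_table_outputs(raw: int) -> list:
    """Return the 64 OUT values in Table 1 row order for a raw 64-bit LUT word."""
    return [(raw >> position) & 1 for position in O_COLUMN]


def bit_swap(raw: int) -> int:
    """Pack the reordered outputs into one word, first row at the top bit."""
    swapped = 0
    for row, bit in enumerate(truth_table_outputs(raw)):
        swapped |= bit << (63 - row)
    return swapped


print(hex(bit_swap(0x3C033C033C033C03)))
```

Keeping the permutation as an explicit table, rather than a formula, makes it straightforward to check each entry against Table 1.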
For example, when the 64 bits extracted from the most significant bit to the least
significant bit of the extracted bits from the active LUT are 0x3C033C033C033C03,
the bit-swapped value for matching the truth table output variable is 0x0F0FF0F000000F0F.
In the truth table with the extracted bits as the output variables, one can see that
A6 is used as the input variable for the logic when the expression using A6 is different
from the expressions that do not use A6. When A6 is used as an input variable for
a logical expression, the expression is used as an LUT having only one logical output.
If an expression that uses A6 as an input variable and an expression that does not
use A6 as an input variable are the same, it can be concluded that A6 is not used
as the input variable for the logical expression. The cases in which A6 = 0 and A6
= 1 are divided. When the logical expressions are not the same, they are used as an
LUT containing one logical expression, for when A6 = 0, and another logical expression,
for when A6 = 1, as outputs. When the logical expressions for A6 = 0 and 1 are the
same, a single logical expression that does not use A6 as an input variable becomes
the logical expression of the LUT with one output.
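The comparison of the A6 = 0 and A6 = 1 cases can be sketched in Python as follows. The sketch assumes that the truth table is held as a 64-entry list in Table 1 row order (index = A1 + 2·A2 + 4·A3 + 8·A4 + 16·A5 + 32·A6), so that the first 32 entries correspond to A6 = 0; the function names are illustrative. It only reports whether the two halves differ; deciding how differing halves should be interpreted remains with the analysis described above.

```python
def split_on_a6(truth_table):
    """Split a 64-entry truth table into its A6 = 0 and A6 = 1 halves."""
    if len(truth_table) != 64:
        raise ValueError("a 6-input LUT truth table must have 64 entries")
    return truth_table[:32], truth_table[32:]


def halves_differ(truth_table) -> bool:
    """True when the expression for A6 = 0 differs from the one for A6 = 1."""
    low_half, high_half = split_on_a6(truth_table)
    return low_half != high_half


# Example tables built from the index convention above.
and_a1_a3 = [(i & 1) & ((i >> 2) & 1) for i in range(64)]   # O = A1 & A3
follows_a6 = [(i >> 5) & 1 for i in range(64)]              # O = A6

print(halves_differ(and_a1_a3), halves_differ(follows_a6))
```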
After the role of A6 is determined, the truth table is converted into a Boolean equation
in sum-of-products form. Output O6, and O5 if the LUT has two outputs, is expressed as
a sum of products of inputs A1 to A6 and reduced to a minimized equation using the
Karnaugh map algorithm; the result is interpreted as the logic implemented in the LUT.
For example, if the final equation is O6 = A1A3, the logic in the LUT is interpreted as
a 2-input AND gate of inputs A1 and A3, which is O6 = A1 & A3 in Verilog format.
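The conversion from a truth table to a Verilog-style expression can be sketched in Python as follows. Note that this is not the Karnaugh map minimization used in this study: as a simpler stand-in, the sketch first discards inputs that never affect the output and then writes the canonical sum of products over the remaining inputs. The truth table is assumed to be a 64-entry list indexed by A1 + 2·A2 + ... + 32·A6, and the function names are illustrative.

```python
def support(truth_table):
    """0-based indices (A1 is 0) of the inputs that can change the output."""
    return [k for k in range(6)
            if any(truth_table[i] != truth_table[i ^ (1 << k)] for i in range(64))]


def to_sum_of_products(truth_table) -> str:
    """Canonical sum of products over the inputs the function depends on."""
    used = support(truth_table)
    terms = []
    for i in range(64):
        if not truth_table[i]:
            continue
        literals = [f"A{k + 1}" if (i >> k) & 1 else f"~A{k + 1}" for k in used]
        term = " & ".join(literals) if literals else "1'b1"
        if term not in terms:          # rows differing only in unused inputs
            terms.append(term)
    if not terms:
        return "1'b0"
    return " | ".join(f"({term})" for term in terms)


and_table = [(i & 1) & ((i >> 2) & 1) for i in range(64)]   # O6 = A1 & A3
print(to_sum_of_products(and_table))
```

For functions with many minterms the output is longer than a minimized equation, so a proper minimizer would still be needed to reproduce compact expressions.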
2. PIP Reverse Engineering
As mentioned previously, the Project X-Ray database provides detailed PIP bitstream
configuration information regarding the connections between PIPs. Whereas the length
of the PIP configuration bits differs for each PIP group, the bitstream structure
for the same type of tile is always the same. Therefore, if one tile is fully analyzed,
the results can be extended to the reverse engineering of other tiles of the same
type.
If the position in the bitstream of the first bit of a specific set of PIP configuration
bits is known, information regarding whether or not all PIPs in the same INT tile
are activated can be obtained. However, because Project X-Ray does not provide an
offset address for each tile, the position of an INT tile within the bitstream and of
its internal PIP configuration bits cannot be obtained using the Project X-Ray database
alone. Therefore, to acquire the position of each INT tile in the bitstream and of the
PIP configuration bits, structural analysis of the PIP configuration bits is required.
As depicted in Fig. 3, a specific INT tile is attached to an adjacent CLB tile, forming a pair of INT and
CLB tiles. For example, CLB_L X2Y50, which is the first CLB written in the bitstream,
is attached to INT_L X2Y50. All CLB tiles consist of 10 frames and INT tiles consist
of 26 frames. When information regarding each tile is written to the bitstream, the
10 frames of the CLB tile are written after the 26 frames of the INT tile. Because
the position of each LUT in the bitstream can be determined using the method described
above, the positions of the CLB tiles containing each LUT can be determined, and the
positions of the INT tiles connected to those CLB tiles can also be determined.
In our experiments on the Artix-7 xc7a50tfgg484-1 FPGA, it was determined that the
PIP information of the INT tile INT_L X2Y50 in the FPGA was written from the address
“29,328 bytes” into the bitstream. As depicted in Fig. 4, INT_L X2Y50 consists of 26 frames. For each frame, information is written in 8-byte
(64 bits) chunks in the order from frame 0 to frame 25 in 404-byte intervals.
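The frame addresses implied by these figures can be worked out with a short Python sketch. The base address of 29,328 bytes, the 26 INT frames, the 10 CLB frames, and the 404-byte interval are taken from the text; the function name is illustrative, and placing the first CLB frame exactly one interval after the last INT frame is an assumption based on the statement that the CLB frames follow the INT frames.

```python
FRAME_STRIDE = 404        # bytes between the starts of consecutive frames
INT_FRAMES = 26           # frames per INT tile
CLB_FRAMES = 10           # frames per CLB tile
INT_L_X2Y50_BASE = 29328  # byte address reported for INT_L X2Y50


def tile_frame_offsets(base: int, frame_count: int, stride: int = FRAME_STRIDE):
    """Byte offset of the 8-byte chunk belonging to each frame of one tile."""
    return [base + frame * stride for frame in range(frame_count)]


int_offsets = tile_frame_offsets(INT_L_X2Y50_BASE, INT_FRAMES)

# Assumption: the paired CLB tile starts right after the 26 INT frames.
clb_base = INT_L_X2Y50_BASE + INT_FRAMES * FRAME_STRIDE
clb_offsets = tile_frame_offsets(clb_base, CLB_FRAMES)

print(int_offsets[0], int_offsets[-1], clb_offsets[0], clb_offsets[-1])
```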
In this manner, the position of each INT tile in the bitstream can be determined and
the results can be organized into a database. The PIP information in the database
of Project X-Ray can then be used for the reverse engineering of all PIPs in the INT
tiles.
The PIP configuration bits in the INT tile database page are written to the bitstream
according to the prescribed rules. First, one column is referred to as one frame.
Each frame consists of a total of 64 bits. Each frame is divided into two parts, namely,
32 bits up and 32 bits down, as depicted in Fig. 7. The lower part is written first followed by the upper part. Each part is written
to the bitstream in the order from top to bottom. Next, one frame is written from
the zeroth bit to the 31st bit, followed by writing from the 32nd bit to the 63rd
bit. The order in which each frame is written to the bitstream is from left to right,
with 404-byte intervals between each frame. For example, when the address in the bitstream
of the start bit of the first frame is n bits, the address in the bitstream of the
start bit of the next frame is n bits + 404 bytes. In addition, one can see that the
PIP configuration bits for each PIP group (e.g., IM13) are distributed over several
frames.
Table 2. PIP configuration bits connected using IMUX_L13 (PIP end point)
Columns other than the first give PIP configuration bits by position (frame_bit).
PIP starting point | 16_42 | 17_42 | 18_43 | 19_43 | 20_42 | 21_42 | 22_42 | 23_42 | 24_42 | 25_42
FAN_BOUNCE_S3_4 | - | - | - | - | 1 | - | 1 | - | 1 | 1
LOGIC_OUTS_L6 | - | - | - | - | - | 1 | 1 | - | 1 | 1
NL1BEG_N3 | 1 | - | - | - | - | - | 0 | 1 | 1 | 1
EL1END2 | - | 1 | - | - | - | - | 0 | 1 | 1 | 1
WL1END2 | - | - | 1 | - | - | - | 0 | 1 | 1 | 1
. . .
SL1END2 | - | - | - | 1 | - | - | 0 | 1 | 1 | 1
FAN_BOUNCE3 | - | - | - | - | 1 | - | 0 | 1 | 1 | 1
LOGIC_OUTS_L10 | - | - | - | - | - | 1 | 0 | 1 | 1 | 1
Table 2 summarizes the PIPs belonging to a PIP group in an INT tile database as well as the
PIP configuration bits that activate the PIPs. The top row represents the position
of each PIP configuration bit, indicating the coordinate value in the 2D array in
the database page of the corresponding INT tile. For example, 16_42 represents the
42nd bit of the 16th frame.
The following represents the value of each PIP configuration bit: A “-” indicates
“do not care,” meaning that the bits marked “-” do not affect the activation of the
corresponding PIP. For example, when the PIP connecting EL1END2 and IMUX_L13 is activated,
the value of “-1----0111” is written for the corresponding PIP configuration bits
(10 bits in total).
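Matching extracted bits against these patterns can be sketched in Python as follows. The dictionary holds only the rows of Table 2 that are shown (the table elides further rows), with bits ordered as in the table columns from 16_42 to 25_42; the function names and the sample bit string are illustrative.

```python
# Patterns for the IMUX_L13 group, transcribed from the visible rows of Table 2.
# "-" marks a bit that does not affect the corresponding PIP.
IMUX_L13_PATTERNS = {
    "FAN_BOUNCE_S3_4": "----1-1-11",
    "LOGIC_OUTS_L6":   "-----11-11",
    "NL1BEG_N3":       "1-----0111",
    "EL1END2":         "-1----0111",
    "WL1END2":         "--1---0111",
    "SL1END2":         "---1--0111",
    "FAN_BOUNCE3":     "----1-0111",
    "LOGIC_OUTS_L10":  "-----10111",
}


def matches(pattern: str, bits: str) -> bool:
    """True when every non-'-' position of the pattern equals the extracted bit."""
    return len(pattern) == len(bits) and all(
        p == "-" or p == b for p, b in zip(pattern, bits))


def active_starting_points(bits: str, patterns=IMUX_L13_PATTERNS):
    """Names of all PIP starting points whose pattern matches the bits."""
    return [name for name, pattern in patterns.items() if matches(pattern, bits)]


print(active_starting_points("0100000111"))
```

Because of the "-" positions, more than one pattern can match a given bit string, so the function returns a list rather than a single name.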
Fig. 7. FPGA configuration sequence according to current frame.
3. Multiple PLP/PIP Matching
From PIP information, we can determine the connectivity between the inputs of multiple
LUTs. If multiple input connections to the LUTs share the same PIP starting point,
they can be defined as one input signal. For example, if two inputs IMUX_L20 and IMUX_L42
connected to two different LUTs have the same PIP starting point SE2END1, we combine
the two inputs to generate one input signal. Based on the combined signals, we can
combine each logic function from the LUTs, acquired from the PLP information, into
a final complete logic function.
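The grouping of LUT inputs by PIP starting point can be sketched in Python as follows. The connection tuples are the input rows transcribed from Table 3 of this paper; the function name and data layout are illustrative and not part of the proposed tool.

```python
from collections import defaultdict

# (starting PIP, end PIP, active LUT, LUT input) for the input rows of Table 3.
CONNECTIONS = [
    ("EL1END3", "IMUX_L37", "LUT D", "A4"),
    ("SE2END1", "IMUX_L42", "LUT D", "A6"),
    ("SE2END1", "IMUX_L20", "LUT C", "A2"),
    ("EE2END0", "IMUX_L33", "LUT C", "A1"),
    ("EE2END0", "IMUX_L41", "LUT D", "A1"),
    ("SL1END3", "IMUX_L46", "LUT D", "A5"),
    ("SS2END3", "IMUX_L39", "LUT D", "A3"),
    ("SS2END3", "IMUX_L23", "LUT C", "A3"),
    ("SS2END2", "IMUX_L36", "LUT D", "A2"),
    ("SS2END2", "IMUX_L21", "LUT C", "A4"),
    ("VCC", "IMUX_L34", "LUT C", "A6"),
]


def group_by_starting_point(connections):
    """Map each PIP starting point to the (LUT, input) pins it drives."""
    groups = defaultdict(list)
    for start, _end, lut, pin in connections:
        groups[start].append((lut, pin))
    return dict(groups)


groups = group_by_starting_point(CONNECTIONS)
# Starting points that drive more than one LUT input form one shared signal.
shared = {start: pins for start, pins in groups.items() if len(pins) > 1}
print(shared["SE2END1"])
```

Each key of `shared` corresponds to one combined input signal, such as the pair IMUX_L20 and IMUX_L42 in the example above.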
IV. Experiments
1. PLP Extraction
Based on the LUT configuration database discussed in Section III, all the bitstreams
of an activated LUT can be obtained. In the 3-bit adder targeted in this study, the
bits comprising LUT C and LUT D of X2Y50 are activated, as presented in Table 3. One LUT consists of a total of 64 bits; therefore, we can acquire 128 bits by configuring
two LUTs. The acquired bits are rearranged according to truth table rules, as indicated
in Table 1, and then used as output variables (denoted as OUT).
Table 3. Connection information of active LUTs and PIPs
Starting PIP | End PIP | Active LUT | Select variable | Conversion variable
EL1END3 | IMUX_L37 | LUT D | A4 |
SE2END1 | IMUX_L42 | LUT D | A6 | a[1]
SE2END1 | IMUX_L20 | LUT C | A2 | a[1]
EE2END0 | IMUX_L33 | LUT C | A1 | a[0]
EE2END0 | IMUX_L41 | LUT D | A1 | a[0]
SL1END3 | IMUX_L46 | LUT D | A5 |
SS2END3 | IMUX_L39 | LUT D | A3 | b[1]
SS2END3 | IMUX_L23 | LUT C | A3 | b[1]
SS2END2 | IMUX_L36 | LUT D | A2 | b[0]
SS2END2 | IMUX_L21 | LUT C | A4 | b[0]
VCC | IMUX_L34 | LUT C | A6 |
LOGIC_OUTS_L11 | NW2BEG3 | LUT D | OUT |
LOGIC_OUTS_L10 | WR1BEG3 | LUT C | OUT |
LOGIC_OUTS_L18 | WW2BEG0 | LUT C | OUT |
As mentioned in Section III.1, if the expressions for A6 = 0 and A6 = 1 are different,
then A6 is used as an input variable for the logical expression. For the extracted
LUT C, the expressions for A6 = 0 and A6 = 1 are different; however, A6 is not driven
by a signal input (Table 3 shows it connected to VCC), so it is not used as an input
variable. Because the output variables are not identical for the cases in which A6 = 0
and A6 = 1, the LUT has two outputs. Based on this information, applying the
Karnaugh-map-based logic interpretation algorithm to the truth table written from LUT C
yields two logical expressions: O6 = (A2⊕A3)⊕A1A4 (when A6 = 0) and O5 = A1A4 (when A6
= 1).
LUT D has the same logical expression for the cases in which A6 is included as an
input variable and A6 is not included as an input variable. Therefore, LUT D has only
one logical expression with A6 as the input variable. The logical expression that
can be obtained from LUT D is O6 = (A4⊕A5)⊕A3A6 + A1A2A3 + A1A2A6.
Although the logic in each activated LUT can be extracted, correlation information
between the two LUTs cannot be obtained using the experimental method described above.
The PIP information connecting two or more LUTs must be obtained to extract information
regarding combinational logic circuits using two or more LUTs.
2. PIP Extraction
Extracting PIP information is essential for extracting logic that spans more than
one LUT. Most PIPs are configured inside INT tiles, which are paired with CLB tiles.
In this study, the activated CLB tile was X2Y50 and the INT tile attached to the X2Y50
CLB tile was also located at X2Y50. Using the pattern discussed in Section III.2, the
PIP configuration data were located inside the bitstream, and all configuration bits of
the X2Y50 INT tile were obtained by analyzing the configuration data. From the obtained
PIP configuration bits, only the bits related to activated PIPs were retained. The
activated PIP information is listed in Table 3.
By referring to Table 3, all the information regarding PIP connection points can be obtained. The connection
information results for the two LUTs are summarized in Fig. 8. One can see that the PIPs with the same starting points are applied to the same
input. The PIPs with the same starting points as the conversion variables presented
in Table 3 can be converted into a[0], a[1], b[0], and b[1].
3. Combinational Logic Circuit Extraction
Based on the PLP and PIP information obtained in Sections IV.1 and IV.2, the connection
point information between the LUTs can be obtained, as depicted in Fig. 8. In this section, we derive a combinational logic circuit using two or more LUTs.
The three logical expressions extracted in Section IV.1 can be converted into conversion
variables as follows by referring to Fig. 8 and Table 3:
Sum[0]: a[0]b[0],
Sum[1]: (a[1]⊕b[1])⊕a[0]b[0],
Sum[2]: (A4⊕A5)⊕a[1]b[1]+a[0]b[0]b[1]+a[0]a[1]b[0].
Table 4. Performance comparison of FPGA reverse engineering tools
Reverse tool | Extractable points: PLP | Extractable points: PIP | FPGA reverse engineering method | Available environment
Debit [12] | △ | △ | XDL correlation | ISE
BIL [13] | X | △ | XDLRC correlation | ISE
Bit2ncd [11] | O | O | XDL correlation | ISE
Logic resynthesizing [17] | △ | X | Bitstream analysis | ISE, Vivado
Proposed | O | O | Bitstream analysis | ISE, Vivado
O: complete / △: incomplete / X: impossible
Fig. 8. Connection information for two activated LUTs.
For verification, the truth table of the circuit written in Verilog in Vivado was
compared with the truth table derived from the extracted logic of the two LUTs. The extracted
logic and the original logic in Vivado were exactly the same, indicating that the
3-bit adder was successfully recreated in the Vivado environment. This demonstrates that the
logic inside the FPGA was successfully extracted using bitstream analysis. A comparison
between the performance of conventional reverse engineering tools and that of our
proposed method is presented in Table 4.
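The truth-table comparison used for verification can be sketched as follows. This is an illustrative check, not the authors' procedure, and it rests on three assumptions: the reference design is an adder with two 2-bit inputs and a 3-bit sum (as in the example [2:0] C = [1:0] A + [1:0] B); Sum[0] is a[0] XOR b[0], which the least-significant bit of an adder requires; and the unconverted LUT inputs A4 and A5 are held at the same constant value, so that A4⊕A5 contributes 0. The function names are hypothetical.

```python
from itertools import product


def extracted_sum(a0, a1, b0, b1, a4=1, a5=1):
    """Evaluate the extracted expressions for one input combination.

    a4 and a5 stand for the unconverted LUT inputs A4 and A5; tying both
    to the same constant is an assumption of this sketch.
    """
    s0 = a0 ^ b0
    s1 = (a1 ^ b1) ^ (a0 & b0)
    s2 = (a4 ^ a5) ^ ((a1 & b1) | (a0 & b0 & b1) | (a0 & a1 & b0))
    return s0, s1, s2


def reference_sum(a0, a1, b0, b1):
    """Bits of the arithmetic sum of two 2-bit operands, least significant first."""
    total = (a0 + 2 * a1) + (b0 + 2 * b1)
    return total & 1, (total >> 1) & 1, (total >> 2) & 1


def matches_adder():
    """Compare both truth tables over all 16 input combinations."""
    return all(
        extracted_sum(*bits) == reference_sum(*bits)
        for bits in product((0, 1), repeat=4)
    )


print(matches_adder())
```

Exhaustive comparison is practical here because the circuit has only four inputs; for larger extracted circuits the same idea would need simulation of sampled vectors or a formal equivalence check.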
Logic can be extracted from LUTs in the form of a gate-level netlist, but not at the
abstraction level of the original user-written Verilog code (e.g., [2:0] C = [1:0] A + [1:0] B). In
future work, we plan to study how to convert the extracted logic into a more user-friendly
Verilog code format.
V. CONCLUSIONS
The existing FPGA reverse engineering method using Vivado cannot extract logic that spans
more than one LUT because it cannot extract the PIP information. In this study, PIP information,
which represents connection point information between the LUTs, was acquired using
the database of Project X-ray. A method for acquiring the address of each INT tile
was proposed, which enabled us to obtain all the activated PIP information. The acquired
PIP information was extended to logic extraction combining two or more LUTs. The PLP
and PIP information was completely separated from the bitstream. We then grouped PIP
information related to the same starting point to combine the two types of information
into a single circuit. To verify the proposed bitstream analysis method, we designed
a 3-bit adder and extracted the circuit information implemented in the FPGA from the
bitstream. Using only bitstream analysis, a 3-bit adder consisting of two LUTs was
successfully extracted. The extracted logic was confirmed to be exactly the same as
the design information in the actual Vivado environment, thus verifying the proposed
FPGA reverse engineering method.
In addition to CLB and INT tiles, there are many more types of tiles inside FPGAs,
including IOB, DSP, and BRAM tiles. If a circuit is designed using such tiles, the
proposed method cannot yet perform complete reverse engineering in the Vivado environment.
In the future, the proposed reverse engineering method will be extended to facilitate
complete reverse engineering by adding reverse engineering methods for additional
types of tiles.
ACKNOWLEDGMENTS
This work was supported as part of the Military Crypto Research Center (UD170109ED) funded
by the Defense Acquisition Program Administration (DAPA) and the Agency for Defense Development
(ADD).
REFERENCES
[1] Virtex-5 FPGA Configuration User Guide UG071, https://www.xilinx.com/
[2] Moradi A., Oct 2011, On the vulnerability of FPGA bitstream encryption against power analysis attacks, in Proc. of the Conf. on Computer and Communications Security, pp. 111-124
[3] Yu H., Lee H., Lee S., Kim Y., Lee H. M., 2018, Recent Advances in FPGA Reverse Engineering, Electronics, Vol. 7, No. 10, pp. 246
[4] Facon A., Guilley S., Ngo X. T., Perianin T., Mar 2019, Hardware-enabled AI for Embedded Security: A New Paradigm, in Proc. of the 3rd International Conference on Recent Advances in Signal Processing Telecommunications & Computing (SigTelCom 2019), pp. 80-84
[5] Malhotra S., Borer T., Singh D., Brown S., Dec 2004, The Quartus University Interface Program Enabling advanced FPGA research, in Proc. of the IEEE International Conference on Field-Programmable Technology, pp. 225-230
[6] Lee J. K., 2012, Verilog functional model extraction from FPGA design data, J KIISE Comput Pract Lett, Vol. 18, pp. 380-388
[7] Tavaragiri A., Couch J., Athanas P., Feb 2011, Exploration of FPGA interconnect for the design of unconventional antennas, in Proc. of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, pp. 219-226
[8] Lavin C., Dec 2010, Rapid prototyping tools for FPGA designs: Rapidsmith, in Proc. of the International Conference on Field-Programmable Technology IEEE, pp. 353-356
[9] Lavin C., Sep 2011, Rapidsmith: Do-it-yourself CAD tools for Xilinx FPGAs, in Proc. of the International Conf. on Field Programmable Logic and Applications, pp. 349-355
[10] Soni R. K., 2013, Open-source bitstream generation for FPGAs, Ph.D. dissertation, Virginia Tech
[11] Ding Z., 2013, Deriving an NCD file from an FPGA bitstream: Methodology, architecture and evaluation, Microprocess & Microsystems, Vol. 37, No. 3, pp. 299-312
[12] Note J. B., Rannaud E., Feb 2008, From the bitstream to the netlist, in Proc. of the International ACM/SIGDA Symposium on Field Programmable Gate Arrays (FPGA), Vol. 18, pp. 264-264
[13] Benz F., Seffrin A., Huss S. A., Aug 2012, Bil: A tool-chain for bitstream reverse-engineering, in Proc. of the International Conf. on Field Programmable Logic and Applications (FPL), pp. 735-738
[14] Zhang T., Wang J., Guo S., Chen Z., 2019, A comprehensive FPGA reverse engineering tool-chain: From bitstream to RTL code, IEEE Access, pp. 38379-38389
[15] Choi S., Park J., Yoo H., Jan 2020, Reverse Engineering for Xilinx FPGA chips using ISE Design Tools, Journal of Integrated Circuits and Systems, Vol. 6, No. 1
[16] Choi S., Yoo H., 2020, Fast Logic Function Extraction of LUT from Bitstream in Xilinx FPGA, Electronics, Vol. 9, No. 7, pp. 1132
[17] Jeong M., May 2018, Extract LUT logics from a downloaded bitstream data in FPGA, in Proc. of 2018 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 1-5
[18] Project X-Ray, June 2020, Available online: https://prjxray.readthedocs.io/en/latest/
[19] Project IceStorm, June 2020, Available online: http://www.clifford.at/icestorm/
[20] A Free and Open Source Verilog-to-Bitstream Flow for iCE40 FPGAs, June 2020, Available online: https://media.ccc.de/v/32c3-7139-a_free_and_open_source_verilog-to-bitstream_flow_for_ice40_fpgas/
[21] Project X-Ray, June 2020, Documenting the Xilinx 7-series bit-stream format. Available online: https://github.com/SymbiFlow/prjxray/
[22] Project X-Ray, June 2020, Available online: https://symbiflow.github.io/prjxray-db/
Author
Hoyoung Yu received a BSc in computer engineering from Shinhan University, Seoul, South Korea,
in February 2018.
He received an M.S. degree from Kwangwoon University, Seoul, South Korea, in 2020.
His research interests include FPGA reverse engineering for hardware security and
digital integrated circuits.
Mannhee Cho received a B.S. degree in computer engineering from Kwangwoon University, Seoul, South
Korea, in 2019.
He is currently working towards an M.S. degree at Hongik University, Seoul, South Korea.
His research interests include FPGA SoC design, FPGA reverse engineering for hardware
security, and digital integrated circuits.
Sangil Lee received a BSc in electronics and information engineering from Korea University,
Sejong, South Korea, in February 2018, and an MSc in electrical engineering from Korea
University, Seoul, South Korea, in August 2020.
His research interests include bio-application systems and integrated circuits.
Hyung-Min Lee received the B.S. degree in electrical engineering (summa cum laude) from Korea University,
Seoul, South Korea, in 2006, the M.S. degree in electrical engineering from the Korea
Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea, in 2008,
and the Ph.D. degree in electrical and computer engineering from the Georgia Institute
of Technology, Atlanta, GA, USA, in 2014.
From 2014 to 2015, he was with the Massachusetts Institute of Technology as a Post-Doctoral
Researcher.
From 2015 to 2017, he was with the IBM T. J. Watson Research Center as a Research
Staff Member.
In 2017, he joined the School of Electrical Engineering, Korea University, where he
is currently an Assistant Professor.
His research areas include analog/mixed-signal/power-management IC and microsystem
design for biomedical, sensor, energy, and security applications.
Youngmin Kim received a BSc in electrical engineering from Yonsei University, Seoul, Korea, in
1999, and an MSc and a PhD in electrical engineering from the University of Michigan,
Ann Arbor, in 2003 and 2007, respectively.
He held a senior engineering position at Qualcomm in San Diego, CA.
He is currently an Associate Professor at Hongik University, Seoul, South Korea. Prior
to joining Hongik University, he was with the School of Computer and Information Engineering
at Kwangwoon University, Seoul, South Korea, and the School of Electrical and Computer
Engineering at the Ulsan National Institute of Science and Technology (UNIST), Ulsan,
South Korea.
His research interests include embedded systems, variability-aware design methodologies,
design for manufacturability, design and technology co-optimization methodologies,
and low-power and 3D IC designs.