
1. (School of Comp. & Info. Engineering, Kwangwoon University, Kwangwoon-ro 20, Nowon-gu, Seoul 01897, Korea)
2. (School of Electronic and Electrical Engineering, Hongik University, Wausan-ro 94, Mapo-gu, Seoul 04066, Korea)
3. (School of Electrical Engineering, Korea University, Seoul 02841, Korea)

FPGA reverse engineering, non-invasive attack, bitstream, logic extract, vivado design suite, project X-ray

## I. INTRODUCTION

In the early days of their development, field-programmable gate arrays (FPGAs) had limited use for testing early models of application-specific integrated circuits in areas such as aerospace engineering and space industry. However, the use of FPGAs has increased and diversified in various fields owing to the advantages of easy circuit development and continuous circuit updates. In particular, because FPGAs are key components in Internet of Things (IoT) devices, which constitute the most representative industry in the fourth industrial revolution era, the demand for FPGAs is growing exponentially. In addition, parallel data processing is possible with FPGAs. This feature is well-suited for use in image processing and artificial intelligence environments. However, owing to the rapid growth in demand for FPGAs, issues related to FPGA security have also become increasingly prominent.

FPGAs have the advantage of being reprogrammable, which facilitates easy hardware updates. Therefore, most FPGAs are based on static random-access memory (SRAM), and the programmable read-only memory (PROM) must be installed outside the FPGA to store FPGA configuration data (bitstreams). When an FPGA is powered on, the bitstream stored in the PROM is downloaded to the FPGA. This bitstream can then be extracted from the download path using a signal measuring instrument such as a logic analyzer. The reverse engineering approach of extracting bitstream files from external memory is a non-invasive approach that does not damage the FPGA. Therefore, this method may be widely used in fields that require non-invasive approaches to analyzing target hardware, such as criminal investigation, military defense, and security. Recently, manufacturers of FPGAs have offered new ways to store bitstreams via complex programmable logic devices or microprocessors, avoiding the need for PROMs (1).

An attacker is able to launch a Trojan attack that inserts malicious circuitry inside the FPGA through extracted bitstreams. The user’s personal information can then be leaked through these malicious circuits. To prevent attacks via bitstreams, various FPGA manufacturers provide bitstream encryption methods such as AES-256 encryption. However, it is essential to identify the root cause of the attacks and take appropriate countermeasures because various methods for extracting encryption keys are being actively researched (2). The FPGA reverse engineering method proposed in this study assumes that a bitstream is not encrypted, and this study can be extended to research on FPGA security.

Xilinx is the biggest FPGA manufacturer, which accounts for more than 60% of the market share. We aimed to perform reverse engineering on the most representative FPGA from Xilinx, which offers two types of design compilers: ISE design suite and Vivado design suite. The ISE design suite only supports families before the 7-series, whereas Vivado supports recent products such as 7-series families.

The reverse engineering of Xilinx FPGAs can be divided into programmable interconnect point (PIP) and programmable logic point (PLP) extraction. PIPs are used to interconnect input/output blocks (IOBs), configurable logic blocks (CLBs), and various tiles using programmable complementary metal-oxide semiconductor (CMOS) transistor switches. They are primarily configured inside interconnect (INT) tiles. PLPs are used to implement logic circuits inside CLBs. The interior of a CLB is composed of a lookup table (LUT), multiplexer (MUX), and carry module. Various logic circuits can be implemented according to LUT input combinations.

In the Xilinx ISE design suite, the Xilinx design language (XDL) is used to represent netlists in text format. Reverse engineering has been performed on families before the 7-series based on a database that analyzes the correlation between bitstreams and the XDL (3-14). Initially, reverse engineering was performed only on the correlations between the XDL and bitstreams, but this approach had very low accuracy because the XDL only includes information regarding programmable points and contains few correlated data points (12). To expand on a previous study (13), XDLRC (XDL report) files were added. These files record the structural information for all FPGA internal circuits as correlation data, leading to higher accuracy. However, reverse engineering was conducted only on the PIP information of INT tiles without considering the technical characteristics of XDLRC structures. Recent papers have reported the complete reverse engineering of FPGAs in ISE environment (14-16).

The Xilinx Vivado design suite environment, which supports the newest FPGA families, does not provide XDL or XDLRC generation functions for security reasons, implying that the previous reverse engineering method used in the ISE design suite environment is not applicable to the Vivado environment. Therefore, a new method is required to reverse engineer FPGAs in the Vivado environment. Because no reference information or database is provided for comparing the compiled design with the bitstream, unlike the ISE environment, which provided XDL and XDLRC generation functions, it is essential to analyze the bitstream structure of FPGAs in the Vivado environment in detail. Reverse engineering requires the patterns in bitstream generation and the bitstream addresses of the data used for configuring each FPGA element.

Jeong et al. (17) succeeded in extracting a single LUT by locating its configuration bits. A simple logic function consisting of a single LUT was implemented and compiled into a bitstream using the Vivado design suite. Because the design did not utilize other elements of the FPGA, the generated bitstream contained minimal data: '1' or '0' for active bits and '0' for most other bits. By locating the active bits in the bitstream, it was possible to determine which parts of the bitstream configure the LUT used in the design. Then, by comparing the bitstreams generated from different logic functions and analyzing how the values change, it was possible to find the pattern in which LUT configuration bits are encoded. The extracted bits were used to write a truth table that lists the output value for every combination of input values, i.e., the logic implemented in the LUT. The experiments showed that extracting LUT logic directly from a bitstream is possible. However, because PIP information, which describes how signals are connected across multiple LUTs, could not be obtained, the method could not be extended to reverse engineer a complex circuit consisting of two or more LUTs with connected inputs.

Fig. 1. Flowchart of proposed FPGA reverse engineering.

In our study, to extract PLP information, we created various sample designs and obtained the address of PLPs in the bitstream. The extracted bits were assigned to the outputs of a truth table based on the LUT structure to acquire the logic function of the circuit. PIP information was extracted based on the X-ray project (18), which provides PIP information on FPGAs in Vivado environment. In this study, we also propose a method that can combine multiple bits of PLP and PIP information to extract a complete circuit. With the proposed method, conventional research on extracting logic from a single LUT can be extended to extracting logic from multiple LUTs. The proposed reverse engineering flow for a Xilinx FPGA in Vivado environment is presented in Fig. 1.

The first step is to extract the bitstream stored in the PROM of the FPGA board. As soon as power is applied to the FPGA, the bitstream is downloaded from the PROM to the FPGA. Data are extracted using a data measurement instrument during download. Next, the extracted bitstream is used to identify the FPGA chip. Because FPGAs have slightly different structures among series, it is necessary to apply different reverse engineering algorithms based on chip identification. In this study, reverse engineering was performed on the Artix-7 Family xc7a50tfgg484-1 chip. Thereafter, the PIP and PLP are extracted separately, and one type of logic is extracted by combining the two types of programmable points (PIP and PLP).

The main contributions of our study are summarized as follows.

· We verified our reverse engineering method for the Xilinx 7-series Artix-7 Family xc7a50tfgg484-1 chip on Vivado design suite version 18.2.

· We proposed a reverse engineering method for FPGAs in the Vivado environment that combines both PLP and PIP information to extract the complete logic composed of multiple LUTs.

· We provided guidelines on extracting PLP and PIP information from the bitstream for 7-series FPGAs in the Vivado environment.

· We verified through experiment that our proposed method can extract the complete logic composed of two LUTs.

Fig. 2. Structure of Xilinx ARTIX-7 family xc7a50tfgg484-1 chip.

## II. Background

The structure of a Xilinx FPGA is presented in Fig. 2. The clock region consists of INT, IOB, CLB, BRAM, and DSP tiles. The INT tiles are used to connect various tiles, such as IOB and CLB, using internal CMOS transistor switches. The CLB tile is divided into two slices, each containing LUT, MUX, and flip-flop (FF) modules, in which each clock region contains a total of 600 CLBs.

A CLB’s internal LUT can generate a variety of logical expressions using six inputs controlled by a total of 64 bits. In the Vivado environment, a user can select and use a desired LUT. A single LUT can be used to create bitstreams continuously to create a sample bitstream group. The positions of the 64 bits that make up each LUT can be identified based on the active bitstream position.

### 1. Bitstream Structure

The bitstream structures (3) of all members of the Xilinx FPGA series are the same. A bitstream is divided into a header, command frame, configuration data, and cyclic redundancy check (CRC). The header contains information such as the chip name of the target FPGA, project name, creation date, and bitstream size. The command frame contains information such as a frame address register and IDCODE, which indicates the start position of the configuration data. Every FPGA is assigned a unique IDCODE, which identifies the type of FPGA. The configuration data determine the configuration of the FPGA’s internal circuitry. Depending on the configuration data values, various circuits can be designed within an FPGA. Finally, the CRC determines whether the stored bitstreams from the PROM have been successfully downloaded to the FPGA. The CRC function can be activated or deactivated. In this study, the CRC function was deactivated.

When a real bitstream is downloaded to an FPGA, the header data are deleted, and the download starts at the command frame. For the FPGA reverse engineering proposed in this study, the command frame and CRC information are not needed. Therefore, only configuration data are required.
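As a rough illustration of discarding the header, the sketch below searches a bitstream file for the 32-bit synchronization word `0xAA995566`, which 7-series configuration streams use to mark the start of the command sequence. The function name `strip_header` is hypothetical, and isolating the configuration data itself would additionally require parsing the command packets, which is not shown here.

```python
# Synchronization word that precedes the configuration commands on 7-series devices.
SYNC_WORD = bytes.fromhex("AA995566")


def strip_header(bitfile: bytes) -> bytes:
    """Return the part of the file that starts at the sync word.

    Everything before the sync word (file header, padding) is dropped.
    """
    pos = bitfile.find(SYNC_WORD)
    if pos < 0:
        raise ValueError("synchronization word not found")
    return bitfile[pos:]


# Synthetic stand-in for a real .bit file: fake header, padding, sync word, payload.
example = b"fake-header" + b"\xff" * 8 + SYNC_WORD + b"\x00\x01"
stream = strip_header(example)
```

In practice the returned bytes would still begin with the command frame, so the configuration data start further into the stream.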

Fig. 3. FPGA index database.

### 2. Project X-Ray

Project X-Ray is a reverse engineering project that analyzes Xilinx’s 7-series FPGAs and is a follow-up project to Project IceStorm (19,20), in which reverse engineering was performed on Lattice’s iCE40 FPGA. Because XDL and XDLRC files are not provided by the Vivado design suite, analysis using only bitstreams is necessary. Project X-Ray uses the Vivado design suite to generate a large number of designs targeting 7-series FPGAs, in a manner similar to fuzz testing. Each design has different elements within the FPGA enabled or disabled, and the generated bitstream has different values at the addresses configuring each element of the FPGA. By comparing the large group of designs and the generated bitstreams, Project X-Ray can build a database listing the element-bitstream correlation. The project is currently under development and available as open-source code, allowing for the collection of various device data from multiple users (21).

The basic flow of Project X-Ray toolchain is as follows. First, one creates sample designs that utilize specific devices in the target FPGA (Artix-7, Kintex-7, and Zynq-7 families). Next, the Xilinx Vivado design suite is used to output these designs as a bitstream. The output bitstream travels through the Project X-Ray toolchain, which outputs FPGA documentation data, containing information about the active elements within the design, and its graphical representation in the form of an HTML file. The FPGA documentation data is then uploaded to update the database of Project X-Ray. This database is organized according to the product families of target FPGAs and is maintained in HTML format (22).

The database contains 2D tile arrays represented on the x- and y-axis (in the same manner as the actual FPGA tile positions), as depicted in Fig. 3, as well as the bitstream information comprising each tile, as depicted in Fig. 4. The index database depicted in Fig. 3 organizes the whole FPGA tile structure into 2D matrices, in which the position of each tile is indicated by the x- and y-axis coordinates. Tiles with names starting with ‘CLB’ are Configurable Logic Block tiles containing information on logic elements, and tiles with ‘INT’ are Interconnect tiles containing PIP information.

The database contains information on the PIPs inside the FPGA, describing the patterns in which the bits configuring the PIPs are distributed in the bitstream. Fig. 4 presents a part of the 2D matrix of PIP configuration bits included in the INT tile database page of the X-Ray database. The information contained in this matrix of PIP configuration bits represents whether the PIPs, i.e., the connections between various elements in the FPGA, are active or not. The matrix is divided into multiple groups, in which each group consists of multiple PIPs sharing the same PIP configuration bits within the bitstream. Each group has several PIP starting points for a particular PIP end point, and the number of PIP configuration bits varies with the number of PIP starting points. For example, the highlighted IMUX_L13 (IM13) group is composed of 10 PIP configuration bits. The values of these PIP configuration bits determine whether each PIP sharing IMUX_L13 as a PIP end point is activated. The PIP configuration bits do not map one-to-one to PIPs; instead, they are encoded so that a combination of bits activates one or more PIPs at the same time.

Fig. 4. Part of 2D matrix of INT_L X2Y50 PIP configuration bits.

Fig. 5. Internal composition of LUT.

As illustrated in Fig. 3, a CLB tile is attached to a corresponding INT tile to form a pair. For example, CLB_L X2Y50, which is the first CLB recorded in the bitstream, is connected to INT_L X2Y50. Every CLB tile consists of 10 frames. An INT tile consists of 26 frames, where each frame consists of 64 bits in the bitstream. The 10 frames of the CLB tiles are used after the 26 frames of the INT tiles. Because the bitstream addresses of CLB tiles can be found by using the internal LUT in experimental designs, the INT tile position can also be determined.

## III. Reverse Engineering of Xilinx 7-Series FPGA

### 1. PLP Reverse Engineering

The FPGA clock region has a total height of 50 tiles, as depicted in Fig. 2. The number of CLB tiles in the clock region is 12 ${\times}$ 50; these configure the logic desired by the user through the internal LUT. Therefore, the bits configured inside the CLB are termed PLPs. The CLB is divided into two slices, each containing LUT, MUX, and FF modules.

As depicted in Fig. 5, each LUT consists of two 32 ${\times}$ 1 MUXs and one 2 ${\times}$ 1 MUX. Various logic expressions are generated through 6-bit MUX signals, and the inputs to the LUT are converted into output signals. The LUT outputs a variety of logical results through six signals that are denoted as A1-A6. Signals A1-A5 control the two 32 ${\times}$ 1 MUXs and signal A6 controls the 2 ${\times}$ 1 MUX. In the LUT structure, signal A6 can be used as an input condition for a logical expression that further generates one logical expression or can be used to allow one LUT to have two outputs. Because the total number of expressions that can be output with six inputs is 64, the bitstream comprising one LUT is 8 bytes long.
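The MUX structure described above can be modeled in a few lines. The following is a minimal sketch, not the authors' tool: the function name `lut_mux_outputs` is hypothetical, and the truth-table indexing (A1 as the least significant select bit, A6 as the most significant) is an assumption chosen to match the row order of Table 1.

```python
def lut_mux_outputs(table, inputs):
    """Evaluate a 64-entry truth table through two 32x1 MUXs and one 2x1 MUX.

    table:  list of 64 values (0 or 1); assumed index = A1 + 2*A2 + 4*A3
            + 8*A4 + 16*A5 + 32*A6.
    inputs: tuple (A1, A2, A3, A4, A5, A6) of 0/1 values.
    Returns (low, high, out): the outputs of the two 32x1 MUXs and of the 2x1 MUX.
    """
    a1, a2, a3, a4, a5, a6 = inputs
    select = a1 | (a2 << 1) | (a3 << 2) | (a4 << 3) | (a5 << 4)
    low = table[select]          # 32x1 MUX over the entries with A6 = 0
    high = table[32 + select]    # 32x1 MUX over the entries with A6 = 1
    out = high if a6 else low    # 2x1 MUX controlled by A6
    return low, high, out


# Example table: the output is A1 AND A3, independent of the other inputs.
and_table = [(i & 1) & ((i >> 2) & 1) for i in range(64)]
result = lut_mux_outputs(and_table, (1, 0, 1, 0, 0, 0))
```

Returning both 32x1 MUX outputs alongside the final output makes it easy to inspect the case in which one LUT provides two outputs.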

Table 1. FPGA LUT truth table with bit-swapping rule

 O A1 A2 A3 A4 A5 A6 OUT 63 0 0 0 0 0 0 0 47 1 0 0 0 0 0 1 62 0 1 0 0 0 0 0 46 1 1 0 0 0 0 1 61 0 0 1 0 0 0 0 45 1 0 1 0 0 0 1 60 0 1 1 0 0 0 0 44 1 1 1 0 0 0 1 15 0 0 0 1 0 0 1 31 1 0 0 1 0 0 0 14 0 1 0 1 0 0 1 30 1 1 0 1 0 0 0 13 0 0 1 1 0 0 1 29 1 0 1 1 0 0 0 12 0 1 1 1 0 0 1 28 1 1 1 1 0 0 0 59 0 0 0 0 1 0 0 43 1 0 0 0 1 0 1 58 0 1 0 0 1 0 0 42 1 1 0 0 1 0 1 57 0 0 1 0 1 0 0 41 1 0 1 0 1 0 1 56 0 1 1 0 1 0 0 40 1 1 1 0 1 0 1 11 0 0 0 1 1 0 1 27 1 0 0 1 1 0 0 10 0 1 0 1 1 0 1 26 1 1 0 1 1 0 0 9 0 0 1 1 1 0 1 25 1 0 1 1 1 0 0 8 0 1 1 1 1 0 1 24 1 1 1 1 1 0 0 55 0 0 0 0 0 1 0 39 1 0 0 0 0 1 0 54 0 1 0 0 0 1 1 38 1 1 0 0 0 1 1 53 0 0 1 0 0 1 1 37 1 0 1 0 0 1 1 52 0 1 1 0 0 1 0 36 1 1 1 0 0 1 0 7 0 0 0 1 0 1 0 23 1 0 0 1 0 1 1 6 0 1 0 1 0 1 1 22 1 1 0 1 0 1 0 5 0 0 1 1 0 1 1 21 1 0 1 1 0 1 0 4 0 1 1 1 0 1 0 20 1 1 1 1 0 1 1 51 0 0 0 0 1 1 0 35 1 0 0 0 1 1 0 50 0 1 0 0 1 1 1 34 1 1 0 0 1 1 1 49 0 0 1 0 1 1 1 33 1 0 1 0 1 1 1 48 0 1 1 0 1 1 0 32 1 1 1 0 1 1 0 3 0 0 0 1 1 1 0 19 1 0 0 1 1 1 0 2 0 1 0 1 1 1 1 18 1 1 0 1 1 1 0 1 0 0 1 1 1 1 1 17 1 0 1 1 1 1 0 0 0 1 1 1 1 1 0 16 1 1 1 1 1 1 1

Fig. 6. Frame configuration address for X25Y50 CLB.

In Vivado environment, a user can choose the LUT to use. Therefore, by analyzing each bitstream generated by an LUT, the positions of the configuration bits of each LUT can be determined. Logic consisting of one LUT is designed and bitstreams are configured to utilize all the LUTs in the FPGA.

When the configuration bits of an LUT are written to the bitstream, the 8 bytes are not written contiguously. Instead, 2 bytes are written at a time at 404-byte intervals. In addition, the LUTs designated as A, B, C, and D in the Vivado environment are not written in this order but in the order B, A, D, C. Based on the bits of the active bitstream, the frame configuration address database for the X25Y50 CLB can be constructed as depicted in Fig. 6.

Activated LUTs can be extracted based on the contents of the database. A total of 8 bytes are extracted in blocks of 2 bytes at intervals of 404 bytes based on the active start bit. A truth table can be written based on the six inputs of the LUT, as presented in Table 1. Among the extracted 64 bits, the bit mapped to the top of the bitstream is the 63rd bit and the bit mapped to the bottom of the bitstream is the zeroth bit. When the extracted 64 bits are specified as output variable values in the truth table, mapping does not follow the sequence from the 63rd bit to the zeroth bit, but it follows the sequence listed in the column highlighted “O” in Table 1.
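The extraction step above (four 2-byte blocks spaced 404 bytes apart) could be sketched as follows. The function name `read_lut_bits`, the byte offset used in the example, and the synthetic data are hypothetical; treating the first block read as the most significant 16 bits is an assumption based on the statement that the bit at the top of the bitstream is the 63rd bit.

```python
BLOCK_STRIDE = 404   # bytes between consecutive blocks of one LUT
BLOCK_SIZE = 2       # bytes per block
BLOCKS_PER_LUT = 4   # 4 blocks x 2 bytes = 8 bytes = 64 bits


def read_lut_bits(config_data: bytes, start: int) -> int:
    """Collect the 64 configuration bits of one LUT as an integer.

    start is the byte offset of the first 2-byte block of the LUT.
    """
    blocks = []
    for k in range(BLOCKS_PER_LUT):
        offset = start + k * BLOCK_STRIDE
        blocks.append(config_data[offset:offset + BLOCK_SIZE])
    raw = b"".join(blocks)
    if len(raw) != BLOCK_SIZE * BLOCKS_PER_LUT:
        raise ValueError("configuration data too short for this LUT")
    # Assumption: the first block holds bits 63..48, the last block bits 15..0.
    return int.from_bytes(raw, "big")


# Synthetic configuration data with one LUT placed at a made-up offset of 100.
data = bytearray(2000)
for k in range(BLOCKS_PER_LUT):
    data[100 + k * BLOCK_STRIDE:100 + k * BLOCK_STRIDE + 2] = b"\x3c\x03"
lut_bits = read_lut_bits(bytes(data), 100)
```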

For example, when the 64 bits extracted from the most significant bit to the least significant bit of the extracted bits from the active LUT are 0x3C033C033C033C03, the bit-swapped value for matching the truth table output variable is 0x0F0FF0F000000F0F.
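The bit-swapping rule can be checked mechanically. The sketch below copies the "O" column of Table 1 into a list and reads the extracted bits in that order; the names `O_COLUMN` and `bit_swap` are hypothetical, and packing the first truth-table row into the most significant bit of the result is an assumption that reproduces the example value given above.

```python
# "O" column of Table 1, listed in truth-table row order (top row first).
O_COLUMN = [
    63, 47, 62, 46, 61, 45, 60, 44, 15, 31, 14, 30, 13, 29, 12, 28,
    59, 43, 58, 42, 57, 41, 56, 40, 11, 27, 10, 26, 9, 25, 8, 24,
    55, 39, 54, 38, 53, 37, 52, 36, 7, 23, 6, 22, 5, 21, 4, 20,
    51, 35, 50, 34, 49, 33, 48, 32, 3, 19, 2, 18, 1, 17, 0, 16,
]


def truth_table_outputs(extracted: int) -> list:
    """Return the 64 output values in truth-table row order."""
    return [(extracted >> bit_index) & 1 for bit_index in O_COLUMN]


def bit_swap(extracted: int) -> int:
    """Pack the reordered outputs into an integer, first row as the MSB."""
    swapped = 0
    for value in truth_table_outputs(extracted):
        swapped = (swapped << 1) | value
    return swapped


swapped = bit_swap(0x3C033C033C033C03)
print(hex(swapped))  # 0xf0ff0f000000f0f
```

The printed value is 0x0F0FF0F000000F0F without the leading zero, which matches the bit-swapped value quoted in the example.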

In the truth table with the extracted bits as the output variables, one can see that A6 is used as the input variable for the logic when the expression using A6 is different from the expressions that do not use A6. When A6 is used as an input variable for a logical expression, the expression is used as an LUT having only one logical output.

If an expression that uses A6 as an input variable and an expression that does not use A6 as an input variable are the same, it can be concluded that A6 is not used as the input variable for the logical expression. The cases in which A6 = 0 and A6 = 1 are divided. When the logical expressions are not the same, they are used as an LUT containing one logical expression, for when A6 = 0, and another logical expression, for when A6 = 1, as outputs. When the logical expressions for A6 = 0 and 1 are the same, a single logical expression that does not use A6 as an input variable becomes the logical expression of the LUT with one output.

After the role of the A6 input is determined, the truth table is converted into a Boolean equation in sum-of-products form. Output O6, and O5 if the LUT has two outputs, is expressed as a sum of products of inputs A1 to A6 and reduced to a minimized equation using a Karnaugh map algorithm; the result is interpreted as the logic implemented in the LUT. For example, if the final equation is O6 = A1A3, the logic in the LUT is interpreted as a 2-input AND gate of inputs A1 and A3, which is O6 = A1 & A3 in Verilog format.
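Before minimization, it can help to check which inputs a truth table actually depends on, including whether the A6 = 0 and A6 = 1 halves coincide. The following is a minimal sketch with hypothetical function names; it assumes the row order of Table 1 (index = A1 + 2·A2 + 4·A3 + 8·A4 + 16·A5 + 32·A6) and does not perform the Karnaugh map minimization itself.

```python
def depends_on(table, n):
    """True if toggling input An (n = 1..6) changes the output for some row."""
    step = 1 << (n - 1)
    return any(table[i] != table[i ^ step] for i in range(64))


def input_support(table):
    """List the input numbers that the truth table depends on."""
    return [n for n in range(1, 7) if depends_on(table, n)]


def minterms(table):
    """Rows (as tuples A1..A6) for which the output is 1."""
    return [tuple((i >> k) & 1 for k in range(6)) for i in range(64) if table[i]]


# Example from the text: O6 = A1 & A3.
table = [(i & 1) & ((i >> 2) & 1) for i in range(64)]
support = input_support(table)
halves_equal = table[:32] == table[32:]
```

For this table the support is A1 and A3 only, and the two halves are identical, so A6 is not an input of the logical expression.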

### 2. PIP Reverse Engineering

As mentioned previously, the database of Project X-Ray provides detailed PIP bitstream configuration information regarding the connections between PIPs. Whereas the length of the PIP configuration bits differs for each PIP group, the bitstream structure for the same type of tile is always the same. Therefore, if one tile is fully analyzed, the results can be extended to the reverse engineering of other tiles of the same type.

If the position in the bitstream of the first bit of a specific set of PIP configuration bits is known, information regarding whether or not all PIPs in the same INT tile are activated can be obtained. However, because Project X-Ray does not provide an offset address for each tile, the position of an INT tile within the bitstream and the positions of its internal PIP configuration bits cannot be obtained using the database of Project X-Ray alone. Therefore, to acquire the position of each INT tile in the bitstream and of its PIP configuration bits, structural analysis of the PIP configuration bits is required.

As depicted in Fig. 3, a specific INT tile is attached to an adjacent CLB tile, forming a pair of INT and CLB tiles. For example, CLB_L X2Y50, which is the first CLB written in the bitstream, is attached to INT_L X2Y50. All CLB tiles consist of 10 frames and INT tiles consist of 26 frames. When information regarding each tile is written to the bitstream, the 10 frames of the CLB tile are written after the 26 frames of the INT tile. Because the position of each LUT in the bitstream can be determined using the method described above, the positions of the CLB tiles included in each LUT can be determined and the positions of the INT tiles connected to the CLB tiles can also be determined.

In our experiments on the Artix-7 xc7a50tfgg484-1 FPGA, it was determined that the PIP information of the INT tile INT_L X2Y50 in the FPGA was written from the address “29,328 bytes” into the bitstream. As depicted in Fig. 4, INT_L X2Y50 consists of 26 frames. For each frame, information is written in 8-byte (64 bits) chunks in the order from frame 0 to frame 25 in 404-byte intervals.
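The addressing described above reduces to simple arithmetic. The sketch below lists the byte offsets of the 26 frames of INT_L X2Y50 and slices the corresponding 8-byte chunks out of a byte string; the function names are hypothetical, and whether the 29,328-byte address is counted from the start of the file or of the configuration data is left as an assumption of the caller.

```python
INT_L_X2Y50_START = 29_328  # byte address reported for INT_L X2Y50
FRAME_STRIDE = 404          # bytes between consecutive frames
INT_FRAMES = 26             # frames per INT tile
FRAME_BYTES = 8             # 64 bits per frame


def int_tile_frame_offsets(start=INT_L_X2Y50_START, frames=INT_FRAMES):
    """Byte offsets of frame 0 .. frame (frames - 1) of one INT tile."""
    return [start + k * FRAME_STRIDE for k in range(frames)]


def read_int_tile_frames(data: bytes, start=INT_L_X2Y50_START):
    """Return the 8-byte chunk of every frame of the INT tile."""
    return [data[o:o + FRAME_BYTES] for o in int_tile_frame_offsets(start)]


offsets = int_tile_frame_offsets()
print(offsets[0], offsets[1], offsets[-1])  # 29328 29732 39428
```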

In this manner, the position of each INT tile in the bitstream can be determined and the results can be organized into a database. The PIP information in the database of Project X-Ray can then be used for the reverse engineering of all PIPs in the INT tiles.

The PIP configuration bits in the INT tile database page are written to the bitstream according to the prescribed rules. First, one column is referred to as one frame. Each frame consists of a total of 64 bits. Each frame is divided into two parts, namely, 32 bits up and 32 bits down, as depicted in Fig. 7. The lower part is written first followed by the upper part. Each part is written to the bitstream in the order from top to bottom. Next, one frame is written from the zeroth bit to the 31st bit, followed by writing from the 32nd bit to the 63rd bit. The order in which each frame is written to the bitstream is from left to right, with 404-byte intervals between each frame. For example, when the address in the bitstream of the start bit of the first frame is n bits, the address in the bitstream of the start bit of the next frame is n bits + 404 bytes. In addition, one can see that the PIP configuration bits for each PIP group (e.g., IM13) are distributed over several frames.

Table 2. PIP configuration bits connected using IMUX_L13 (PIP end point)

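| PIP starting point / PIP configuration bit (frame_bit) | 16_42 | 17_42 | 18_43 | 19_43 | 20_42 | 21_42 | 22_42 | 23_42 | 24_42 | 25_42 |
|---|---|---|---|---|---|---|---|---|---|---|
| FAN_BOUNCE_S3_4 | - | - | - | - | 1 | - | 1 | - | 1 | 1 |
| LOGIC_OUTS_L6 | - | - | - | - | - | 1 | 1 | - | 1 | 1 |
| NL1BEG_N3 | 1 | - | - | - | - | - | 0 | 1 | 1 | 1 |
| EL1END2 | - | 1 | - | - | - | - | 0 | 1 | 1 | 1 |
| WL1END2 | - | - | 1 | - | - | - | 0 | 1 | 1 | 1 |
| ... | | | | | | | | | | |
| SL1END2 | - | - | - | 1 | - | - | 0 | 1 | 1 | 1 |
| FAN_BOUNCE3 | - | - | - | - | 1 | - | 0 | 1 | 1 | 1 |
| LOGIC_OUTS_L10 | - | - | - | - | - | 1 | 0 | 1 | 1 | 1 |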

Table 2 summarizes the PIPs belonging to a PIP group in an INT tile database as well as the PIP configuration bits that activate the PIPs. The top row represents the position of each PIP configuration bit, indicating the coordinate value in the 2D array in the database page of the corresponding INT tile. For example, 16_42 represents the 42nd bit of the 16th frame.

The following represents the value of each PIP configuration bit: A “-” indicates “do not care,” meaning that the bits marked “-” do not affect the activation of the corresponding PIP. For example, when the PIP connecting EL1END2 and IMUX_L13 is activated, the value of “-1----0111” is written for the corresponding PIP configuration bits (10 bits in total).
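Matching extracted bits against such patterns amounts to a comparison that skips the "do not care" positions. The sketch below uses only the rows listed in Table 2 (the table is partial, so the dictionary is too); the names `IM13_PATTERNS` and `active_pips` are hypothetical, and the bit string is assumed to be given in the column order of Table 2, from 16_42 to 25_42.

```python
# Rows of Table 2 for PIP end point IMUX_L13; "-" marks a "do not care" bit.
IM13_PATTERNS = {
    "FAN_BOUNCE_S3_4": "----1-1-11",
    "LOGIC_OUTS_L6":   "-----11-11",
    "NL1BEG_N3":       "1-----0111",
    "EL1END2":         "-1----0111",
    "WL1END2":         "--1---0111",
    "SL1END2":         "---1--0111",
    "FAN_BOUNCE3":     "----1-0111",
    "LOGIC_OUTS_L10":  "-----10111",
}


def matches(pattern: str, bits: str) -> bool:
    """True if every non-"-" position of pattern equals the extracted bit."""
    return len(pattern) == len(bits) and all(
        p == "-" or p == b for p, b in zip(pattern, bits)
    )


def active_pips(bits: str, patterns=IM13_PATTERNS):
    """Starting points whose pattern matches the 10 extracted bits."""
    return [start for start, pattern in patterns.items() if matches(pattern, bits)]


print(active_pips("0100000111"))  # ['EL1END2']
```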

Fig. 7. FPGA configuration sequence according to current frame.

### 3. Multiple PLP/PIP Matching

From PIP information, we can determine the connectivity between the inputs of multiple LUTs. If multiple input connections to the LUTs share the same PIP starting point, they can be defined as one input signal. For example, if two inputs IMUX_L20 and IMUX_L42 connected to two different LUTs have the same PIP starting point SE2END1, we combine the two inputs to generate one input signal. Based on the combined signals, we can combine each logic function from the LUTs, acquired from the PLP information, into a final complete logic function.

## IV. Experiments

### 1. PLP Extraction

Based on the LUT configuration database discussed in Section III, all the bitstreams of an activated LUT can be obtained. In the 3-bit adder targeted in this study, the bits comprising LUT C and LUT D of X2Y50 are activated, as presented in Table 3. One LUT consists of a total of 64 bits; therefore, we can acquire 128 bits by configuring two LUTs. The acquired bits are rearranged according to truth table rules, as indicated in Table 1, and then used as output variables (denoted as OUT).

Table 3. Connection information of active LUTs and PIPs

| Starting PIP | End PIP | Active LUT | Select variable | Conversion variable |
|---|---|---|---|---|
| EL1END3 | IMUX_L37 | LUT D | A4 | |
| SE2END1 | IMUX_L42 | LUT D | A6 | a[1] |
| SE2END1 | IMUX_L20 | LUT C | A2 | a[1] |
| EE2END0 | IMUX_L33 | LUT C | A1 | a[0] |
| EE2END0 | IMUX_L41 | LUT D | A1 | a[0] |
| SL1END3 | IMUX_L46 | LUT D | A5 | |
| SS2END3 | IMUX_L39 | LUT D | A3 | b[1] |
| SS2END3 | IMUX_L23 | LUT C | A3 | b[1] |
| SS2END2 | IMUX_L36 | LUT D | A2 | b[0] |
| SS2END2 | IMUX_L21 | LUT C | A4 | b[0] |
| VCC | IMUX_L34 | LUT C | A6 | |
| LOGIC_OUTS_L11 | NW2BEG3 | LUT D | OUT | |
| LOGIC_OUTS_L10 | WR1BEG3 | LUT C | OUT | |
| LOGIC_OUTS_L18 | WW2BEG0 | LUT C | OUT | |

As mentioned in Section III.1, the truth table is split into the cases A6 = 0 and A6 = 1, and the resulting expressions are compared. For the extracted LUT C, the output values for A6 = 0 and A6 = 1 are not identical, and A6 is not used as a signal input (Table 3 shows that it is tied to VCC); LUT C is therefore treated as an LUT with two outputs. Applying the Karnaugh-map-based logic interpretation algorithm to the truth table written from LUT C yields two logical expressions: O6 = (A2$\oplus$A3)$\oplus$A1A4 (when A6 = 0) and O5 = A1A4 (when A6 = 1).

LUT D has the same logical expression for the cases in which A6 is included as an input variable and A6 is not included as an input variable. Therefore, LUT D has only one logical expression with A6 as the input variable. The logical expression that can be obtained from LUT D is O6 = (A4$\oplus$A5)$\oplus$A3A6 + A1A2A3 + A1A2A6.

Although the logic in each activated LUT can be extracted, correlation information between the two LUTs cannot be obtained using the experimental method described above. The PIP information connecting two or more LUTs must be obtained to extract information regarding combinational logic circuits using two or more LUTs.

### 2. PIP Extraction

Extracting PIP information is essential for extracting information from more than one LUT. Most PIPs are configured inside INT tiles, which are paired with CLB tiles. In this study, the activated CLB tile was X2Y50 and the INT tile attached to the X2Y50 CLB tile was also located at X2Y50. Using the pattern discussed in Section III.2, the PIP configuration bits were located inside the bitstream, and all configuration bits of the X2Y50 INT tile were obtained by analyzing the configuration data. From the obtained PIP configuration bits, only the bits related to activated PIPs were retained. The activated PIP information is listed in Table 3.

By referring to Table 3, all the information regarding PIP connection points can be obtained. The connection information results for the two LUTs are summarized in Fig. 8. One can see that the PIPs with the same starting points are applied to the same input. The PIPs with the same starting points as the conversion variables presented in Table 3 can be converted into a[0], a[1], b[0], and b[1].
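The grouping of LUT inputs by PIP starting point can be sketched as follows, using the input rows of Table 3 as data. The names `TABLE3_INPUT_PIPS` and `shared_signals` are hypothetical, and the LUT output rows are omitted because only input connections are grouped here.

```python
from collections import defaultdict

# Input rows of Table 3: (starting PIP, end PIP, active LUT, LUT input pin).
TABLE3_INPUT_PIPS = [
    ("EL1END3", "IMUX_L37", "D", "A4"),
    ("SE2END1", "IMUX_L42", "D", "A6"),
    ("SE2END1", "IMUX_L20", "C", "A2"),
    ("EE2END0", "IMUX_L33", "C", "A1"),
    ("EE2END0", "IMUX_L41", "D", "A1"),
    ("SL1END3", "IMUX_L46", "D", "A5"),
    ("SS2END3", "IMUX_L39", "D", "A3"),
    ("SS2END3", "IMUX_L23", "C", "A3"),
    ("SS2END2", "IMUX_L36", "D", "A2"),
    ("SS2END2", "IMUX_L21", "C", "A4"),
    ("VCC", "IMUX_L34", "C", "A6"),
]


def shared_signals(rows):
    """Map each starting PIP that feeds several LUT pins to those (LUT, pin) pairs."""
    groups = defaultdict(list)
    for start, _end, lut, pin in rows:
        groups[start].append((lut, pin))
    return {start: pins for start, pins in groups.items() if len(pins) > 1}


shared = shared_signals(TABLE3_INPUT_PIPS)
for start, pins in shared.items():
    print(start, pins)
```

Each of the four starting points that remain (SE2END1, EE2END0, SS2END3, SS2END2) feeds one pin of LUT C and one pin of LUT D, which is why each can be treated as a single input signal.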

### 3. Combinational Logic Circuit Extraction

Based on the PLP and PIP information obtained in Sections IV.1 and IV.2, the connection point information between the LUTs can be obtained, as depicted in Fig. 8. In this section, we derive a combinational logic circuit using two or more LUTs. The three logical expressions extracted in Section IV.1 can be rewritten in terms of the conversion variables as follows by referring to Fig. 8 and Table 3:

Sum[0] = a[0]b[0],

Sum[1] = (a[1]$\oplus$b[1])$\oplus$a[0]b[0],

Sum[2] = (A4$\oplus$A5)$\oplus$a[1]b[1]+a[0]b[0]b[1]+a[0]a[1]b[0].

Fig. 8. Connection information for two activated LUTs.

Table 4. Performance comparison of FPGA reverse engineering tools

| Reverse tool | Extractable points: PLP | Extractable points: PIP | FPGA reverse engineering method | Available environment |
|---|---|---|---|---|
| Debit [12] | △ | △ | XDL correlation | ISE |
| BIL [13] | X | △ | XDLRC correlation | ISE |
| Bit2ncd [11] | O | O | XDL correlation | ISE |
| Logic resynthesizing [17] | △ | X | Bitstream analysis | ISE, Vivado |
| Proposed | O | O | Bitstream analysis | ISE, Vivado |

O: complete / △: incomplete / X: impossible

For verification, the truth table for the circuit written in Verilog with Vivado was compared to the truth table written from the extracted logic of the two LUTs. The extracted logic and the original logic in Vivado were exactly the same, indicating that the 3-bit adder was successfully recreated in the Vivado environment. This shows that the logic inside the FPGA was successfully extracted using bitstream analysis. A comparison between the performance of conventional reverse engineering tools and that of our proposed method is presented in Table 4.

Logic can be extracted from the LUTs in the form of a gate-level netlist, but not in a form as readable as the original user-written Verilog code (e.g., [2:0] C = [1:0] A + [1:0] B). In future work, we plan to study the conversion of extracted logic into a more user-friendly Verilog code format.

## V. Conclusions

The existing FPGA reverse engineering method using Vivado is unable to extract more than one LUT because the PIP information cannot be extracted. In this study, PIP information, which represents connection point information between the LUTs, was acquired using the database of Project X-ray. A method for acquiring the address of each INT tile was proposed, which enabled us to obtain all the activated PIP information. The acquired PIP information was extended to logic extraction combining two or more LUTs. The PLP and PIP information was completely separated from the bitstream. We then grouped PIP information related to the same starting point to combine the two types of information into a single circuit. To verify the proposed bitstream analysis method, we designed a 3-bit adder and extracted the circuit information implemented in the FPGA from the bitstream. Using only bitstream analysis, a 3-bit adder consisting of two LUTs was successfully extracted. The extracted logic was confirmed to be exactly the same as the design information in the actual Vivado environment, thus verifying the proposed FPGA reverse engineering method.

In addition to CLB and INT tiles, there are many more types of tiles inside FPGAs, including IOB, DSP, and BRAM tiles. If a circuit is designed using such tiles, the proposed method cannot perform complete reverse engineering on FPGAs in the Vivado environment. In the future, the proposed method will be extended to support complete reverse engineering by adding methods for these additional tile types.

### ACKNOWLEDGMENTS

This work was supported as part of the Military Crypto Research Center (UD170109ED) funded by the Defense Acquisition Program Administration (DAPA) and the Agency for Defense Development (ADD).

### REFERENCES

1. Virtex-5 FPGA Configuration User Guide UG071, https://www.xilinx.com/
2. Moradi A., Oct 2011, On the vulnerability of FPGA bitstream encryption against power analysis attacks, in Proc of the Conf on Computer and Communications Security, pp. 111-124
3. Yu H., Lee H., Lee S., Kim Y., Lee H. M., 2018, Recent Advances in FPGA Reverse Engineering, Electronics, Vol. 7, No. 10, pp. 246
4. Facon A., Guilley S., Ngo X. T., Perianin T., Mar 2019, Hardware-enabled AI for Embedded Security: A New Paradigm, in Proc of the 3rd International Conference on Recent Advances in Signal Processing Telecommunications & Computing (SigTelCom 2019), pp. 80-84
5. Malhotra S., Borer T., Singh D., Brown S., Dec 2004, The Quartus University Interface Program Enabling advanced FPGA research, in Proc of the IEEE International Conference on Field-Programmable Technology, pp. 225-230
6. Lee J. K., 2012, Verilog functional model extraction from FPGA design data, J KIISE Comput Pract Lett, Vol. 18, pp. 380-388
7. Tavaragiri A., Couch J., Athanas P., Feb 2011, Exploration of FPGA interconnect for the design of unconventional antennas, in Proc of the ACM/SIGDA International Symposium on Field Programmable Gate Arrays, pp. 219-226
8. Lavin C., Dec 2010, Rapid prototyping tools for FPGA designs: Rapidsmith, in Proc of the International Conference on Field-Programmable Technology IEEE, pp. 353-356
9. Lavin C., Sep 2011, Rapidsmith: Do-it-yourself CAD tools for Xilinx FPGAs, in Proc of the International Conf. on Field Programmable Logic and Applications, pp. 349-355
10. Soni R. K., 2013, Open-source bitstream generation for FPGAs, Ph.D. dissertation, Virginia Tech
11. Ding Z., 2013, Deriving an NCD file from an FPGA bitstream: Methodology, architecture and evaluation, Microprocess & Microsystems, Vol. 37, No. 3, pp. 299-312
12. Note J. B., Rannaud E., Feb 2008, From the bitstream to the netlist, in Proc of the International ACM/SIGDA Symposium on Field Programmable Gate Arrays (FPGA), Vol. 18, pp. 264-264
13. Benz F., Seffrin A., Huss S. A., Aug 2012, Bil: A tool-chain for bitstream reverse-engineering, in Proc of the International Conf on Field Programmable Logic and Applications (FPL), pp. 735-738
14. Zhang T., Wang J., Guo S., Chen Z., 2019, A comprehensive FPGA reverse engineering tool-chain: From bitstream to RTL code, in Proc of IEEE Access, pp. 38379-38389
15. Choi S., Park J., Yoo H., Jan 2020, Reverse Engineering for Xilinx FPGA chips using ISE Design Tools, Journal of Integrated Circuits and Systems, Vol. 6, No. 1
16. Choi S., Yoo H., 2020, Fast Logic Function Extraction of LUT from Bitstream in Xilinx FPGA, Electronics, Vol. 9, No. 7, pp. 1132
17. Jeong M., May 2018, Extract LUT logics from a downloaded bitstream data in FPGA, in Proc of 2018 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 1-5
18. Project X-Ray, June 2020, Available online: https://prjxray.readthedocs.io/en/latest/
19. Project IceStorm, June 2020, Available online: http://www.clifford.at/icestorm/
20. A Free and Open Source Verilog-to-Bitstream Flow for iCE40 FPGAs, June 2020, Available online: https://media.ccc.de/v/32c3-7139-a_free_and_open_source_verilog-to-bitstream_flow_for_ice40_fpgas/
21. Project X-Ray, June 2020, Documenting the Xilinx 7-series bit-stream format. Available online: https://github.com/SymbiFlow/prjxray/
22. Project X-Ray, June 2020, Available online: https://symbiflow.github.io/prjxray-db/

## Author

##### Hoyoung Yu

received a BSc in computer engineering from Shinhan University, Seoul, South Korea, in February 2018.

He received an M.S. degree from Kwangwoon University, Seoul, South Korea, in 2020.

His research interests include FPGA reverse engineering for hardware security and digital integrated circuits.

##### Mannhee Cho

received a B.S. degree in computer engineering from Kwangwoon University, Seoul, South Korea, in 2019.

He is currently working toward an M.S. degree at Hongik University, Seoul, South Korea.

His research interests include FPGA SoC design, FPGA reverse engineering for hardware security and digital integrated circuits.

##### Sangil Lee

received a BSc in electronics and information engineering from Korea University, Sejong, South Korea, in February 2018, and an MSc in electrical engineering from Korea University, Seoul, South Korea, in August 2020.

His research interests include bio-application systems and integrated circuits.

##### Hyung-Min Lee

received the B.S. degree in electrical engineering (summa cum laude) from Korea University, Seoul, South Korea, in 2006, the M.S. degree in electrical engineering from the Korea Advanced Institute of Science and Technology (KAIST), Daejeon, South Korea, in 2008, and the Ph.D. degree in electrical and computer engineering from the Georgia Institute of Technology, Atlanta, GA, USA, in 2014.

From 2014 to 2015, he was with the Massachusetts Institute of Technology as a Post-Doctoral Researcher.

From 2015 to 2017, he was with the IBM T. J. Watson Research Center as a Research Staff Member.

In 2017, he joined the School of Electrical Engineering, Korea University, where he is currently an Assistant Professor.

His research interests include analog/mixed-signal/power-management IC and microsystem design for biomedical, sensor, energy, and security applications.

##### Youngmin Kim

received a BSc in electrical engineering from Yonsei University, Seoul, Korea, in 1999, and an MSc and a PhD in electrical engineering from the University of Michigan, Ann Arbor, in 2003 and 2007, respectively.

He held a senior engineering position at Qualcomm in San Diego, CA.

He is currently an Associate Professor at Hongik University, Seoul, South Korea. Prior to joining Hongik University, he was with the School of Computer and Information Engineering at Kwangwoon University, Seoul, South Korea, and the School of Electrical and Computer Engineering at the Ulsan National Institute of Science and Technology (UNIST), Ulsan, South Korea.

His research interests include embedded systems, variability-aware design methodologies, design for manufacturability, design and technology co-optimization methodologies, and low-power and 3D IC designs.