DCE Q816 – first processor in FPGA

This article is an archival post from my first website created around 2017. The quality of the graphics is not the best, as I unfortunately no longer have access to the originals, but for the sake of preserving the memory of this project, I am providing related materials.

The DCE system is a processor (DCE Q816) and coprocessor (DCE Q817) designed by me and run on FPGAs. It was created while I was learning the VHDL language and learning about FPGA circuits in general. I decided to share my project with others to make them curious about FPGAs.

The project was implemented on two Elbert v2 boards, each equipped with a Xilinx Spartan XC3S50A chip. As you can see, both boards were modified. On the DCE Q816 processor board, the 7-segment displays were disconnected. The DCE Q817 coprocessor board was modified more heavily: the 7-segment displays and five buttons were removed completely, and the LEDs and DIP switches were disconnected. These modifications were made to increase the number of available pins.

Below the FPGA circuit boards, we see three more components of the DCE system. These are an LCD display based on the HD44780 controller, a power section that supplies the appropriate voltage to the other circuits, and four 7-segment displays, of which three are active.

Above we can see a general schematic of the DCE system. The main component is the DCE Q816 processor, which contains the central processing unit (CPU) and RAM. Eight DIP switches are connected to its input. The other component is the DCE Q817 coprocessor. It serves both as the ROM, where the executed program is stored, and as the graphics processor: it generates the signals for the LCD and LED displays connected to the coprocessor’s outputs.


DCE Q816 is an 8-bit processor implemented on an FPGA chip. It works in conjunction with the DCE Q817 coprocessor. The chip is based on the MS2 architecture, my own concept derived from the Harvard architecture (program memory and operational memory are physically separate). The processor is clocked at 12 MHz, but the clock frequency can be changed in software, down to as little as 2.8 Hz. The processor has an 8-bit input port and a 16-bit output port. In addition, the chip has an input for an external RESET signal.

Above we see that the CPU is built from four functional blocks. The CORE MS2 execution core is where all arithmetic and logic operations are performed. The control circuit is where all control signals are generated. The RAM (operational memory) has a capacity of 2 kbit (256 B) and is implemented in a memory block inside the FPGA chip. The last component is the I/O (input/output) block, which handles the input and output ports.

The CORE MS2 logic (execution) core is the main component of the DCE Q816 processor. It is here that all arithmetic and logic operations are performed. It includes an arithmetic logic unit (ALU), an input multiplexer and a set of registers.

Register A is the first memory element: all data is stored here first, regardless of its final destination. Register B stores the second operand for the ALU. Register C stores the command for the ALU. The last register is STS, a 4-bit register that holds the so-called status bits. These are needed to execute conditional jump commands.

| Carry bit | State bit | State bit | State bit |
|---|---|---|---|
| Cout | A=B | A>B | A<B |

The table shows the exact structure of the STS register. The first three bits (from the right) describe the relationship between the values stored in registers A and B. Simply put, if the value stored in B is greater than the one stored in A, the first bit (A<B) is set to one. The last bit of the STS register is Cout, the so-called carry bit. It indicates that the result of the operation carried out by the arithmetic logic unit (ALU) exceeded the range of 8 bits. The output of the STS register is connected to the control circuit.
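The STS layout described above can be modelled in a few lines of Python. This is only an illustrative sketch, not the VHDL from the project. The function name `sts_bits` is made up for the example, the comparison is assumed to be unsigned, and Cout is assumed to be set whenever the unmasked ALU result does not fit in 8 bits.

```python
def sts_bits(a: int, b: int, alu_result: int) -> int:
    """Return the 4-bit STS value laid out as Cout, A=B, A>B, A<B (bit 3 to bit 0)."""
    a &= 0xFF
    b &= 0xFF
    cout = 1 if alu_result > 0xFF else 0  # assumption: carry means "result exceeds 8 bits"
    eq = 1 if a == b else 0
    gt = 1 if a > b else 0                # assumption: unsigned comparison
    lt = 1 if a < b else 0
    return (cout << 3) | (eq << 2) | (gt << 1) | lt


print(format(sts_bits(5, 9, 5 + 9), "04b"))          # 0001
print(format(sts_bits(200, 100, 200 + 100), "04b"))  # 1010
```

In the second call, 200 + 100 = 300 does not fit in 8 bits, so Cout is set together with the A>B bit.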

The MUX, or input multiplexer, is the element that selects which input signal to write to the A register. This signal is selected based on the command being executed.

The main component of the CORE MS2’s execution core is the arithmetic logic unit (ALU). It is here that operations are performed on variables stored in registers A and B, according to the command stored in register C. In Table 2 we see all the commands supported by the ALU.

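| Binary code | Command | Description |
|---|---|---|
| 0000 | A | Pass A through |
| 0001 | A+1 | Increment A |
| 0010 | A+B | Add |
| 0011 | A-B | Subtract |
| 0100 | A and B | Logic operation |
| 0101 | A or B | Logic operation |
| 0110 | A xor B | Logic operation |
| 0111 | A nand B | Logic operation |
| 1000 | A nor B | Logic operation |
| 1001 | A xnor B | Logic operation |
| 1010 | not A | Logic operation |
| 1011 | PL | Shift left |
| 1100 | PR | Shift right |
| 1101 | RL | Rotate left |
| 1110 | RP | Rotate right |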
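The command codes in Table 2 can be read as a lookup from a 4-bit code to an operation on A and B. Below is a minimal Python sketch of such an ALU. It is not the project's VHDL, and several details are assumptions: shifts and rotates move by one bit, subtraction wraps around in 8 bits, and the carry is reported only for A+1 and A+B. The name `alu` is hypothetical.

```python
MASK = 0xFF  # the ALU works on 8-bit values


def alu(cmd: int, a: int, b: int) -> tuple[int, int]:
    """Return (8-bit result, carry-out) for a 4-bit command code from Table 2."""
    a &= MASK
    b &= MASK
    ops = {
        0b0000: a,                          # A
        0b0001: a + 1,                      # A+1
        0b0010: a + b,                      # A+B
        0b0011: a - b,                      # A-B (wraps around in 8 bits)
        0b0100: a & b,                      # A and B
        0b0101: a | b,                      # A or B
        0b0110: a ^ b,                      # A xor B
        0b0111: ~(a & b),                   # A nand B
        0b1000: ~(a | b),                   # A nor B
        0b1001: ~(a ^ b),                   # A xnor B
        0b1010: ~a,                         # not A
        0b1011: a << 1,                     # PL: shift left by one (assumed)
        0b1100: a >> 1,                     # PR: shift right by one (assumed)
        0b1101: (a << 1) | (a >> 7),        # RL: rotate left by one (assumed)
        0b1110: (a >> 1) | ((a & 1) << 7),  # RP: rotate right by one (assumed)
    }
    raw = ops[cmd]  # code 1111 is not listed in Table 2 and raises KeyError
    carry = 1 if cmd in (0b0001, 0b0010) and raw > MASK else 0
    return raw & MASK, carry


print(alu(0b0010, 200, 100))       # (44, 1)
print(alu(0b1101, 0b10000001, 0))  # (3, 0)
```

Masking with `0xFF` at the end turns Python's unbounded integers into 8-bit results, which also makes the inverted operations (nand, nor, xnor, not) come out as 8-bit complements.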

The I/O circuit is a set of three registers: IN, OUT1 and OUT2. As you can guess, one of them is an input and the other two are outputs. The processor stores in them all the data it receives from outside and all the data it wants to send out. In other words, they are the processor’s interface to the outside world.

The RAM block consists of two elements: the RAR register, which stores the address, and the memory itself. The memory was implemented using a memory block available inside the FPGA chip. The memory cell addresses are 8 bits wide, giving a total capacity of 2 kbit (256 bytes).
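The split between an address register and the memory itself can be illustrated with a small Python model. This is a sketch under assumptions, not the actual design: the class and method names are invented for the example, and the memory is assumed to start out zeroed.

```python
class Ram:
    """256 x 8-bit memory addressed through a separate address register (RAR)."""

    ADDRESS_BITS = 8
    WORD_BITS = 8

    def __init__(self) -> None:
        self.rar = 0                                 # address register
        self.cells = [0] * (1 << self.ADDRESS_BITS)  # assumption: zero-initialised

    def set_address(self, value: int) -> None:
        self.rar = value & 0xFF

    def write(self, value: int) -> None:
        self.cells[self.rar] = value & 0xFF

    def read(self) -> int:
        return self.cells[self.rar]


ram = Ram()
ram.set_address(0x10)
ram.write(0x2A)
print(ram.read())                      # 42
print(len(ram.cells) * Ram.WORD_BITS)  # 2048 bits, i.e. 2 kbit
```

The last line reproduces the capacity arithmetic: 2^8 addresses times 8 bits per cell gives 2048 bits, or 256 bytes.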

The control circuit is the most complex part of the processor. It is here that all control signals are generated, based on the current command and the clock signal. In addition, the control circuit includes an address counter for the ROM, the registers that store jump addresses, and the DB register, which stores the direct data received from the ROM.

In the image we see simplified timing waveforms for the DCE Q816 processor. Every command takes four clock cycles to execute, regardless of its type. Four signals are shown in the figure. The first is ROM Address, the address for the program memory. Its value changes on the fourth clock cycle: it is either incremented by one or set to the jump address stored in the PCL and PCH registers. The next signal is the data appearing at the input of the processor, which shows up just after the address change. IR/DB data shows the outputs of the IR (instruction register) and DB (direct data) registers. The data stored in these registers determines which command the processor executes. These registers are written on the first clock cycle. The last signal shows what is at the output of the chip. The OUT registers and most of the processor’s other registers are updated on the second clock cycle.

This is a rather simplified diagram of how the processor works, but it can be put even more simply. The chip fetches an instruction from ROM and executes it. It then increments or sets the address to fetch the next instruction, and the cycle repeats.
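The fetch, execute and advance loop described above can be sketched as a toy Python model that spends four clock cycles on every instruction. It is a heavily simplified illustration, not the real control circuit. Only a handful of opcodes are modelled, the ALU is assumed to pass A straight through (C = 0000), and the jump target is assumed to be PCH as the high byte and PCL as the low byte. The names `run` and `demo` are hypothetical.

```python
# Opcodes copied from the instruction list; only a handful are modelled.
ADB, OUT1, PCL, PCH, JMP = 0b00011, 0b01001, 0b01011, 0b01100, 0b01111


def run(program, max_instructions=100):
    """Run (opcode, direct_data) pairs; return (OUT1 writes, clock cycles used)."""
    pc = a = pcl = pch = 0
    out1_writes = []
    cycles = 0
    for _ in range(max_instructions):
        if pc >= len(program):
            break  # ran off the end of this toy ROM
        # Cycle 1: latch the instruction (IR) and the direct data (DB).
        ir, db = program[pc]
        # Cycle 2: update the register selected by the instruction.
        if ir == ADB:
            a = db & 0xFF
        elif ir == OUT1:
            out1_writes.append(a)  # assumes the ALU passes A through (C = 0000)
        elif ir == PCL:
            pcl = a
        elif ir == PCH:
            pch = a
        # Cycle 3: nothing visible happens in this simplified model.
        # Cycle 4: jump to PCH:PCL (assumed layout) or step to the next address.
        pc = ((pch << 8) | pcl) if ir == JMP else pc + 1
        cycles += 4
    return out1_writes, cycles


demo = [
    (ADB, 4),    # 0: A = 4
    (PCL, 0),    # 1: PCL = A, the low byte of the jump target
    (JMP, 0),    # 2: jump to address 4
    (OUT1, 0),   # 3: skipped by the jump
    (ADB, 42),   # 4: A = 42
    (OUT1, 0),   # 5: OUT1 = 42
]
print(run(demo))  # ([42], 20)
```

Five instructions are executed (the one at address 3 is skipped), so the model reports 20 clock cycles.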

| No. | Binary code | Name | Description |
|---|---|---|---|
| 1 | 00000 | NOP | Do nothing |
| 2 | 00001 | AC | Transfer the value from the CORE MS2 output to A |
| 3 | 00010 | AIN | Transfer the contents of the IN register to A |
| 4 | 00011 | ADB | Transfer the contents of the DB register to A |
| 5 | 00100 | ARAM | Transfer the contents of RAM to A |
| 6 | 00101 | B | Transfer the contents of register A to B |
| 7 | 00110 | C | Transfer the contents of register A to C |
| 8 | 00111 | RAMA | Write the CORE MS2 output value to the RAM address register |
| 9 | 01000 | RAMD | Write the CORE MS2 output value to RAM |
| 10 | 01001 | OUT1 | Write the CORE MS2 output value to the OUT1 register |
| 11 | 01010 | OUT2 | Write the CORE MS2 output value to the OUT2 register |
| 12 | 01011 | PCL | Write the CORE MS2 output value to the PCL register |
| 13 | 01100 | PCH | Write the CORE MS2 output value to the PCH register |
| 14 | 01101 | IN | Write a value to the IN register |
| 15 | 01110 | CLRA | Clear register A |
| 16 | 01111 | JMP | Unconditional jump |
| 17 | 10000 | JMP> | Jump if A>B |
| 18 | 10001 | JMP< | Jump if A<B |
| 19 | 10010 | JMP= | Jump if A=B |
| 20 | 10011 | JMPC | Jump if Cout=1 |
| 21 | 10100 | JMPNC | Jump if Cout=0 |
| 22 | 10101 | RESET | General processor reset |
| 23 | 10110 | CF | Change the clock frequency |

Table 3 lists the instructions executed by the DCE Q816 processor. This is also a good time to discuss the structure of an instruction. A single instruction consists of two parts: the first five bits are the general command (the value in Table 3) and the remaining eight bits are the so-called direct data.

| Direct data | Instruction |
|---|---|
| D12 D11 D10 D9 D8 D7 D6 D5 | D4 D3 D2 D1 D0 |
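Packing and unpacking such a 13-bit instruction word can be sketched in Python as follows. The bit placement is an assumption read off the order of the table header: direct data in D12 to D5 and the instruction code in D4 to D0. If the real layout is the other way round, the shifts need to be swapped. The function names `encode` and `decode` are made up for the example.

```python
INSTRUCTION_BITS = 5  # general command, see the instruction list
DATA_BITS = 8         # direct data


def encode(opcode: int, data: int = 0) -> int:
    """Pack an opcode and direct data into one 13-bit word (assumed layout)."""
    if not 0 <= opcode < (1 << INSTRUCTION_BITS):
        raise ValueError("opcode must fit in 5 bits")
    if not 0 <= data < (1 << DATA_BITS):
        raise ValueError("direct data must fit in 8 bits")
    return (data << INSTRUCTION_BITS) | opcode


def decode(word: int) -> tuple[int, int]:
    """Split a 13-bit word back into (opcode, direct data)."""
    opcode = word & ((1 << INSTRUCTION_BITS) - 1)
    data = (word >> INSTRUCTION_BITS) & ((1 << DATA_BITS) - 1)
    return opcode, data


word = encode(0b00011, 0x2A)  # ADB with direct data 42
print(format(word, "013b"))   # 0010101000011
print(decode(word))           # (3, 42)
```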

Direct data has three functions. The first is a constant stored in the program, on which operations are later performed. The second is related to the first: the value starts out as a constant stored in the program, but it is then written to the C register and becomes a command for the ALU. The values correspond to the commands in Table 2. The last function of direct data is to set the clock frequency of the processor. As I have already mentioned, the processor can change its clock frequency in software. When command 23 (Table 3) is placed in the instruction part, the direct data selects the processor’s clock frequency. Table 6 lists all the available values and the corresponding direct data.

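| Binary code | Frequency |
|---|---|
| 00000 | 2.8 Hz |
| 00001 | 5.7 Hz |
| 00010 | 11 Hz |
| 00011 | 22 Hz |
| 00100 | 45 Hz |
| 00101 | 91 Hz |
| 00110 | 183 Hz |
| 00111 | 366 Hz |
| 01000 | 732 Hz |
| 01001 | 1.5 kHz |
| 01010 | 3 kHz |
| 01011 | 5 kHz |
| 01100 | 11 kHz |
| 01101 | 23 kHz |
| 01110 | 46 kHz |
| 01111 | 93 kHz |
| 10000 | 187 kHz |
| 10001 | 375 kHz |
| 10010 | 750 kHz |
| 10011 | 1.5 MHz |
| 10100 | 3 MHz |
| 10101 | 6 MHz |
| 10110 | 12 MHz |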
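The values in the frequency table are consistent with dividing the 12 MHz clock by a power of two: code 22 (10110) gives the full 12 MHz, and each step down halves the frequency. The Python sketch below reproduces the table on that assumption. The divider structure is inferred from the numbers, not taken from the VHDL source, and the name `clock_frequency` is hypothetical.

```python
BASE_CLOCK_HZ = 12_000_000
MAX_CODE = 0b10110  # 22, which selects the undivided 12 MHz clock


def clock_frequency(code: int) -> float:
    """Return the clock frequency in Hz selected by a 5-bit direct-data code."""
    if not 0 <= code <= MAX_CODE:
        raise ValueError("code must be between 0 and 22")
    return BASE_CLOCK_HZ / (1 << (MAX_CODE - code))  # assumed power-of-two divider


for code in (0b00000, 0b01000, 0b10110):
    print(format(code, "05b"), round(clock_frequency(code), 1))
```

For code 00000 this gives about 2.86 Hz, which the table lists as 2.8 Hz. Several table entries appear to be truncated rather than rounded in the same way.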

The DCE Q817 coprocessor is the second component of the DCE system. It was implemented on a board identical to the processor’s. The coprocessor acts as the ROM in which the program executed by the processor is stored. In addition, it generates the signals for the 7-segment displays based on data received from the processor. At present it does not modify the signals that control the LCD display; it simply passes them through, and generating those signals is handled by the processor.

The ROM implemented on the FPGA chip is actually RAM with initial values assigned. These values are the program executed by the processor. The capacity of the ROM is 26 kbit (about 3 kB).

As I mentioned, this is an archived description of a project I built for the purpose of learning the VHDL language. In the end, there were four versions of this project built in ISE, each of which I posted on my GitHub profile.
