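<p> I re-customized the IP for these two modules with Vivado's IP Catalog, but for reasons I could not determine the resulting system did not work. The .bit file from the original project (built with a newer software version) works when downloaded straight to the FPGA (the RISC-V CPU is found over JTAG), so the code structure itself is fine. I therefore used that project as a reference and ported the SoC myself.</p>
<p> In the previous post I covered simulating the E203 SoC RTL. Porting that code to an FPGA requires writing a separate top-level module rather than reusing the simulation testbench top (tb_top.v). In the FPGA project that ships with Perf-V, the top-level module is system.v; it is essentially a "shell" wrapped around the e203_soc_top module that connects the FPGA pin signals to it.</p>
<p> Here are the input and output ports of the e203_soc_top module:</p>
<p> Clock signals and enables: there are two clock inputs, high-frequency (HF) and low-frequency (LF), at 16 MHz and 32.768 kHz respectively. The enable signals are outputs and can be left unconnected.</p>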
<p><span style="color:#2980b9;"><strong><span style="font-family:Courier;"> input hfextclk,<br />
output hfxoscen,</span></strong></span></p>
<p><span style="color:#2980b9;"><strong><span style="font-family:Courier;"> input lfextclk,<br />
output lfxoscen,</span></strong></span></p>
<p><span style="color:#2980b9;"><strong><span style="font-family:Courier;"> input io_pads_jtag_TCK_i_ival,<br />
input io_pads_jtag_TMS_i_ival,<br />
input io_pads_jtag_TDI_i_ival,<br />
output io_pads_jtag_TDO_o_oval,<br />
output io_pads_jtag_TDO_o_oe,</span></strong></span></p>
<p><span style="color:#2980b9;"><strong><span style="font-family:Courier;"> input io_pads_gpio_0_i_ival,<br />
output io_pads_gpio_0_o_oval,<br />
output io_pads_gpio_0_o_oe,<br />
output io_pads_gpio_0_o_ie,<br />
output io_pads_gpio_0_o_pue,<br />
output io_pads_gpio_0_o_ds,</span></strong></span></p>
<p><span style="color:#2980b9;"><strong><span style="font-family:Courier;"> ......</span></strong></span><br />
Signals for the QSPI flash. The DQ lines are bidirectional, so each of them also has 6 signals.</p>
<p><span style="color:#2980b9;"><strong><span style="font-family:Arial;"> output io_pads_qspi_sck_o_oval,<br />
output io_pads_qspi_cs_0_o_oval,<br />
input io_pads_qspi_dq_0_i_ival,<br />
output io_pads_qspi_dq_0_o_oval,<br />
output io_pads_qspi_dq_0_o_oe,<br />
output io_pads_qspi_dq_0_o_ie,<br />
output io_pads_qspi_dq_0_o_pue,<br />
output io_pads_qspi_dq_0_o_ds,</span></strong></span></p>
<p> Additional control signals:</p>
<p><span style="color:#2980b9;"><span style="font-family:Courier;"><strong> // Erst is input need to be pull-up by default<br />
input io_pads_aon_erst_n_i_ival,</strong></span></span></p>
<p><span style="color:#2980b9;"><span style="font-family:Courier;"><strong> // dbgmode are inputs need to be pull-up by default<br />
input io_pads_dbgmode0_n_i_ival,<br />
input io_pads_dbgmode1_n_i_ival,<br />
input io_pads_dbgmode2_n_i_ival,</strong></span></span></p>
<p><span style="color:#2980b9;"><span style="font-family:Courier;"><strong> // BootRom is input need to be pull-up by default<br />
input io_pads_bootrom_n_i_ival,</strong></span></span></p>
<p><span style="color:#2980b9;"><span style="font-family:Courier;"><strong> // dwakeup is input need to be pull-up by default<br />
input io_pads_aon_pmu_dwakeup_n_i_ival,</strong></span></span></p>
<p><span style="color:#2980b9;"><span style="font-family:Courier;"><strong> // PMU output is just output without enable<br />
output io_pads_aon_pmu_padrst_o_oval,<br />
output io_pads_aon_pmu_vddpaden_o_oval </strong></span></span></p>
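<p> The system.v file of the Perf-V demo defines a number of FPGA input/output ports, for example the board's LEDs, RGB LEDs, push buttons, slide switches, header I/O pins, and the JTAG port. Special pins such as the clock and JTAG have to match the board; how the E203's 32 GPIOs map onto the LEDs, switches, and headers is up to you.</p>
<p> Each E203 GPIO has 6 signals but only needs to map to a single FPGA pin. The demo's system.v does it like this:</p>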
<pre><code>wire iobuf_gpio_7_o;
assign dut_io_pads_gpio_7_i_ival = iobuf_gpio_7_o & dut_io_pads_gpio_7_o_ie;</code></pre>
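<p> The IOBUF primitive used here is supported by the Xilinx tools, and its meaning is easy to guess. The output-enable signal dut_io_pads_gpio_7_o_oe decides whether the FPGA pin is actually driven (output), and the input-enable signal dut_io_pads_gpio_7_o_ie decides whether the level on the pin is passed through to the internal signal. The remaining two signals, dut_io_pads_gpio_7_o_pue (pull-up enable) and dut_io_pads_gpio_7_o_ds (purpose unclear to me, possibly drive strength), are left unused. As far as I can tell the FPGA cannot change a pin's pull-up dynamically, so pue has no use here.</p>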
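<p> The gating described above can be modelled in a few lines of C. This is only a behavioural sketch under my own assumptions: the type and function names are hypothetical, pue and ds are ignored as in the demo, and the pin is treated as a plain 0/1 level rather than a real tri-state net.</p>

```c
#include <stdbool.h>

/* Control signals the SoC drives for one GPIO pad. The field names follow
 * the e203_soc_top port suffixes; the struct itself is hypothetical. */
typedef struct {
    bool oval; /* value to drive when the output is enabled */
    bool oe;   /* output enable */
    bool ie;   /* input enable */
} pad_ctrl;

/* Level on the FPGA pin: the SoC drives it when oe is set, otherwise the
 * level supplied by the board (ext_level) is what the pin shows. */
bool pad_pin_level(pad_ctrl c, bool ext_level)
{
    return c.oe ? c.oval : ext_level;
}

/* Value returned to the SoC on i_ival: the pin level gated by ie, mirroring
 * `iobuf_gpio_7_o & dut_io_pads_gpio_7_o_ie` in system.v. */
bool pad_i_ival(pad_ctrl c, bool ext_level)
{
    return pad_pin_level(c, ext_level) && c.ie;
}
```

<p> With ie cleared, the SoC always reads 0 regardless of what is on the pin, which is why an input pin needs its in_en bit set before it can be read.</p>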
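<p> The demo's assignment of GPIOs to board resources is a little complicated, so I simplified it for now: GPIO0~8 go straight to the three RGB LEDs, the next ones go to the Arduino header, and what is left goes to the PMOD connector.</p>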
<pre><code> assign led0_r = gpio_0;
 assign led0_g = gpio_1;
 assign led0_b = gpio_2;
 assign led1_r = gpio_3;
 assign led1_g = gpio_4;
 assign led1_b = gpio_5;
 assign led2_r = gpio_6;
 assign led2_g = gpio_7;
 assign led2_b = gpio_8;
 // Bit indices on arduino_a, arduino_d and pmod_io are assumed to run in GPIO order
 assign arduino_a[0] = gpio_9;
 assign arduino_a[1] = gpio_10;
 assign arduino_a[2] = gpio_11;
 assign arduino_a[3] = gpio_12;
 assign arduino_a[4] = gpio_13;
 assign arduino_a[5] = gpio_14;
 assign arduino_d[0] = gpio_15;
 assign arduino_d[1] = gpio_16;
 assign arduino_d[2] = gpio_17;
 assign arduino_d[3] = gpio_18;
 assign arduino_d[4] = gpio_19;
 assign arduino_d[5] = gpio_20;
 assign arduino_d[6] = gpio_21;
 assign arduino_d[7] = gpio_22;
 assign arduino_d[8] = gpio_23;
 assign arduino_d[9] = gpio_24;
 assign arduino_d[10] = gpio_25;
 assign arduino_d[11] = gpio_26;
 assign arduino_d[12] = gpio_27;
 assign arduino_d[13] = gpio_28;
 assign pmod_io[0] = gpio_29;
 assign pmod_io[1] = gpio_30;
 assign pmod_io[2] = gpio_31;</code></pre>
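<p> The E203 SoC main clock is 16 MHz, but the board oscillator is 50 MHz, so a PLL is needed to convert the clock. This is where an FPGA vendor IP comes in. Following the demo, I use an MMCM IP block to generate 16 MHz and 8.388 MHz clocks, then divide the 8.388 MHz clock by 256 to get roughly 32.768 kHz for the LF clock; it does not need to be very accurate for now. The 16 MHz is not mandatory either: dividing 50 MHz down to 25 MHz or 12.5 MHz for the RISC-V core would also be fine, except that the UART clocking changes and its divisor has to be set again.</p>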
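<p> As a rough check of these numbers, the sketch below computes the divided-down LF frequency and what a UART divisor might look like at different core clocks. The helper names, the 115200 baud example, and the relation baud = f_in / (div + 1) are assumptions made for illustration, not taken from the demo; check the E203 UART register description before relying on them.</p>

```c
#include <stdint.h>

/* Frequency in Hz after an integer divide-by-n. */
double divided_hz(double in_hz, unsigned n)
{
    return in_hz / n;
}

/* UART divisor for a given input clock, ASSUMING baud = f_in / (div + 1).
 * The quotient is rounded to the nearest integer before subtracting 1. */
uint32_t uart_div(uint32_t f_in_hz, uint32_t baud)
{
    return (f_in_hz + baud / 2) / baud - 1;
}
```

<p> Under these assumptions, 8.388 MHz / 256 comes out at 32765.625 Hz, close enough to 32.768 kHz for this purpose, and the divisor for 115200 baud moves from 138 at 16 MHz to 216 at 25 MHz.</p>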
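<p> The E203 SoC reset input is io_pads_aon_erst_n_i_ival. The demo generates it with a reset-module IP, but that is exactly where my earlier failure came from. So I dropped it and simply combined the PLL lock signal with the external reset button instead:</p>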
<pre><code> assign reset_periph = ~ck_rst | ~mmcm_locked;
assign dut_io_pads_aon_erst_n_i_ival = ~reset_periph;</code></pre>
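<p> The polarity is easy to get wrong, so here is the same logic as a small C truth function. It assumes ck_rst is low while the button is pressed, which is what the ~ck_rst term implies; the function name is hypothetical.</p>

```c
#include <stdbool.h>

/* ck_rst:      board reset button, assumed low while pressed.
 * mmcm_locked: high once the MMCM output clocks are stable.
 * Returns the level of the active-low io_pads_aon_erst_n_i_ival. */
bool erst_n(bool ck_rst, bool mmcm_locked)
{
    bool reset_periph = !ck_rst || !mmcm_locked;
    return !reset_periph;
}
```

<p> The SoC leaves reset only when the button is released and the MMCM reports lock; either condition alone keeps erst_n low.</p>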
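<p> The four JTAG signals are handled the same way as in the demo; they are indispensable. The QSPI signals go to the on-board flash that stores the RISC-V program. QSPI can be ignored during bring-up.</p>
<p> Once the FPGA interface is wired up, synthesis can be started in Vivado; fix errors as they show up and rerun.</p>
<p> One more task is mapping the FPGA interface signals to FPGA pins. The demo project already has a constraints file, which can be extracted and modified; otherwise assigning pins one by one from the schematic would be tedious.</p>
<p> After synthesis and implementation, Vivado reports the FPGA resource utilization:</p>
<p> After downloading the generated .bit file to the FPGA, the SoC needs to be checked. This requires a debugger that accesses internal resources through the SoC's JTAG port. This is not the FPGA's own JTAG but a JTAG implemented in FPGA user logic and connected to the RISC-V core. On the board the two JTAG headers have the same pinout, and the PerfXLab download cable presumably supports both uses. I did not want to keep plugging and unplugging it, though, so I used a separate tool for the RISC-V JTAG.</p>
<p> I used an FT232R as the debug hardware and connected openocd to the E203 SoC on the board:</p>
<p>Messages like these mean the RISC-V CPU has been detected.</p>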
<p> </p>
<p> Since there is no program in the system yet, the CPU should be in an exception state.</p>
<p><span style="color:#8e44ad;">Open On-Chip Debugger<br />
> halt<br />
> reg pc<br />
pc (/32): 0x00000000</span></p>
<p><span style="color:#8e44ad;">> step<br />
halted at 0x0 due to step<br />
> mdw 0 4<br />
0x00000000: 00000000 00000000 00000000 00000000</span></p>
<p><span style="color:#8e44ad;">></span><br />
One way to run a program is to have openocd write the code directly into memory, then set the PC register and run.</p>
<p><span style="color:#8e44ad;">> load_image f:/rv32.hex<br />
1602 bytes written at address 0x80000000<br />
72 bytes written at address 0x80001000<br />
16 bytes written at address 0x80002fb8<br />
downloaded 1690 bytes in 1.156250s (1.427 KiB/s)</span></p>
<p><span style="color:#8e44ad;">> resume 0x80000000<br />
> halt<br />
halted at 0x80000090 due to debug interrupt<br />
></span></p>
<p><span style="color:#8e44ad;">GNU gdb (GDB)<br />
Copyright (C) 2019 Free Software Foundation, Inc.<br />
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html><br />
This is free software: you are free to change and redistribute it.<br />
There is NO WARRANTY, to the extent permitted by law.<br />
Type "show copying" and "show warranty" for details.<br />
This GDB was configured as "--host=i686-w64-mingw32 --target=riscv-nuclei-elf".<br />
Type "show configuration" for configuration details.<br />
For bug reporting instructions, please see:<br />
<http://www.gnu.org/software/gdb/bugs/>.<br />
Find the GDB manual and other documentation resources online at:</span></p>
<p><span style="color:#8e44ad;">For help, type "help".<br />
Type "apropos word" to search for commands related to "word".<br />
(gdb) set arch riscv:rv32<br />
The target architecture is assumed to be riscv:rv32<br />
(gdb) target remote localhost:3333<br />
Remote debugging using localhost:3333<br />
warning: No executable has been specified and target does not support<br />
determining executable automatically. Try using the "file" command.<br />
0x80000090 in ?? ()<br />
(gdb) x /3i $pc<br />
=> 0x80000090: sw gp,-140(t5)<br />
0x80000094: j 0x80000086<br />
0x80000096: auipc a0,0x2<br />
(gdb) si<br />
halted at 0x80000094 due to step<br />
0x80000094 in ?? ()<br />
(gdb) si<br />
halted at 0x80000086 due to step<br />
0x80000086 in ?? ()<br />
(gdb) si<br />
halted at 0x80000088 due to step<br />
0x80000088 in ?? ()<br />
(gdb) si<br />
halted at 0x8000008c due to step<br />
0x8000008c in ?? ()<br />
(gdb) si<br />
halted at 0x80000090 due to step<br />
0x80000090 in ?? ()</span></p>
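<p> The code loaded into memory above is one of the instruction-set test programs from the E203 source tree. It appears that once the test body finishes it enters a loop: single-stepping shows execution cycling between 0x80000086 and 0x80000094.</p>
<p> For comparison, the corresponding part of the program's dump:</p>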
<pre><code>80000086 <write_tohost>:
80000086: 4521 li a0,8
80000088: 30052073 csrs mstatus,a0
8000008c: 00001f17 auipc t5,0x1
80000090: f63f2a23 sw gp,-140(t5) # 80001000 <tohost>
80000094: bfcd j 80000086 <write_tohost></code></pre>
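<p> The store address in this dump can be checked by hand: auipc adds its 20-bit immediate, shifted left by 12, to the address of the auipc instruction itself, and the sw then adds its signed offset. A minimal sketch of that arithmetic (the function names are mine):</p>

```c
#include <stdint.h>

/* auipc rd, imm20 : rd = pc + (imm20 << 12) */
uint32_t auipc_result(uint32_t pc, uint32_t imm20)
{
    return pc + (imm20 << 12);
}

/* Effective address of a load/store: base register plus signed offset. */
uint32_t effective_addr(uint32_t base, int32_t offset)
{
    return base + (uint32_t)offset;
}
```

<p> For the auipc at 0x8000008c with immediate 0x1, t5 becomes 0x8000108c, and the offset of -140 (0x8c) brings the store to 0x80001000, which matches the tohost symbol in the dump.</p>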
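<p> A program like this performs no I/O, so nothing can be observed on the hardware. The next step is a program that exercises the GPIO.</p>
<p> The GPIO device register table of the E203 SoC:</p>
<p> Simple LED control only requires enabling the GPIO output (writing 1 to the corresponding output_en bit) and then writing the output data register. The GPIO device base address is <span style="color:#c0392b;">0x10012000</span>, so a program that lights the on-board RGB LED can write to it directly:</p>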
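<pre><code class="language-cpp">#include <stdint.h>
typedef struct{
    uint32_t pin;
    uint32_t in_en;
    uint32_t out_en;
    uint32_t odata;
} GPIO_Type;
volatile GPIO_Type * GPIO = (volatile GPIO_Type *)0x10012000;
void micro_delay(void)
{
    int i;
    for (i = 0; i < 100; i++) {  // short busy-wait
        asm volatile ("nop\n");
        asm volatile ("nop\n");
    }
}
/* Loop structure reconstructed from the disassembly listed further down */
void test(void)
{
    int r,g,b;
    GPIO->out_en = 7;
    int t;
    int i;
    for (;;)
        for (r = 0; r <= 8; r++)
            for (g = 0; g <= 8 - r; g++) {
                b = 8 - r - g;  // r + g + b is always 8 delay slots
                for (t = 0; t < 2000; t++) {
                    GPIO->odata = 0x1FE; // show R
                    for (i = 0; i < r; i++) micro_delay();
                    GPIO->odata = 0x1FD; // show G
                    for (i = 0; i < g; i++) micro_delay();
                    GPIO->odata = 0x1FB; // show B
                    for (i = 0; i < b; i++) micro_delay();
                }
            }
}</code></pre>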
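<p> The color change comes from lighting R, G, and B in turn for unequal lengths of time and gradually varying those lengths. The delay is the simplest possible: NOP instructions in a loop.</p>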
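<p> To see which mixes the sweep passes through, the schedule can be enumerated on a PC. Judging from the disassembly of test() shown later in this post, each frame appears to consist of 8 delay slots shared between R, G, and B. The sketch below lists every split in the order the loops seem to visit them; the type and function names are hypothetical and the slot count is my reading of the disassembly.</p>

```c
enum { SLOTS = 8 }; /* delay slots per frame, as read from the disassembly */

typedef struct { int r, g, b; } rgb_slots;

/* Writes up to `max` splits of one full sweep into out[] and returns how
 * many splits the sweep contains in total. */
int color_sweep(rgb_slots *out, int max)
{
    int n = 0;
    for (int r = 0; r <= SLOTS; r++) {
        for (int g = 0; g <= SLOTS - r; g++) {
            if (n < max) {
                out[n] = (rgb_slots){ r, g, SLOTS - r - g };
            }
            n++;
        }
    }
    return n;
}
```

<p> One sweep under this reading has 45 steps, starting with all 8 slots on blue and ending with all 8 on red; each step is held for 2000 frames before moving on.</p>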
<p> </p>
<p> The build works the same way as for an ARM Cortex-M0 target, just with the RISC-V toolchain:</p>
<p> Compile the C file:</p>
<p><span style="color:#c0392b;"><strong>riscv-nuclei-elf-gcc -c -Os led.c -march=rv32imc -mabi=ilp32</strong></span></p>
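<p>This selects the RV32IMC instruction-set variant; the code is simple enough that it would probably work without it. If the ABI option is omitted, GCC reports an error.</p>
<p> Link into an ELF:</p>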
<p><span style="color:#c0392b;"><strong>riscv-nuclei-elf-ld led.o -Ttext 0x80000000</strong></span></p>
<p>This sets the location of the code section; 0x80000000 is the ITCM RAM of the E203 SoC.</p>
<p> Disassemble to inspect the program code:</p>
<p><span style="color:#c0392b;"><strong>riscv-nuclei-elf-objdump -d a.out</strong></span></p>
<pre><code>Disassembly of section .text:
80000000 <micro_delay>:
80000000: 06400793 li a5,100
80000004: 0001 nop
80000006: 0001 nop
80000008: 17fd addi a5,a5,-1
8000000a: ffed bnez a5,80000004 <micro_delay+0x4>
8000000c: 8082 ret
8000000e <test>:
8000000e: 7179 addi sp,sp,-48
80000010: d422 sw s0,40(sp)
80000012: 80001437 lui s0,0x80001
80000016: 0a442783 lw a5,164(s0) # 800010a4 <__global_pointer$+0xfffff800>
8000001a: c65e sw s7,12(sp)
8000001c: c462 sw s8,8(sp)
8000001e: c266 sw s9,4(sp)
80000020: c06a sw s10,0(sp)
80000022: d606 sw ra,44(sp)
80000024: d226 sw s1,36(sp)
80000026: d04a sw s2,32(sp)
80000028: ce4e sw s3,28(sp)
8000002a: cc52 sw s4,24(sp)
8000002c: ca56 sw s5,20(sp)
8000002e: c85a sw s6,16(sp)
80000030: 471d li a4,7
80000032: c798 sw a4,8(a5)
80000034: 1fe00b93 li s7,510
80000038: 1fd00c13 li s8,509
8000003c: 1fb00c93 li s9,507
80000040: 4d25 li s10,9
80000042: 4921 li s2,8
80000044: 4981 li s3,0
80000046: 8a4a mv s4,s2
80000048: 4481 li s1,0
8000004a: a82d j 80000084 <test+0x76>
8000004c: 7d000a93 li s5,2000
80000050: 0a442783 lw a5,164(s0)
80000054: 4b01 li s6,0
80000056: 0177a623 sw s7,12(a5)
8000005a: 033b1c63 bne s6,s3,80000092 <test+0x84>
8000005e: 0a442783 lw a5,164(s0)
80000062: 4b01 li s6,0
80000064: 0187a623 sw s8,12(a5)
80000068: 03649863 bne s1,s6,80000098 <test+0x8a>
8000006c: 0a442783 lw a5,164(s0)
80000070: 4b01 li s6,0
80000072: 0197a623 sw s9,12(a5)
80000076: 034b4463 blt s6,s4,8000009e <test+0x90>
8000007a: 1afd addi s5,s5,-1
8000007c: fc0a9ae3 bnez s5,80000050 <test+0x42>
80000080: 0485 addi s1,s1,1
80000082: 1a7d addi s4,s4,-1
80000084: fc9954e3 bge s2,s1,8000004c <test+0x3e>
80000088: 0985 addi s3,s3,1
8000008a: 197d addi s2,s2,-1
8000008c: fba99de3 bne s3,s10,80000046 <test+0x38>
80000090: bf4d j 80000042 <test+0x34>
80000092: 37bd jal 80000000 <micro_delay>
80000094: 0b05 addi s6,s6,1
80000096: b7d1 j 8000005a <test+0x4c>
80000098: 37a5 jal 80000000 <micro_delay>
8000009a: 0b05 addi s6,s6,1
8000009c: b7f1 j 80000068 <test+0x5a>
8000009e: 378d jal 80000000 <micro_delay>
800000a0: 0b05 addi s6,s6,1
800000a2: bfd1 j 80000076 <test+0x68></code></pre>
<p>Because compressed instructions are used, 32-bit and 16-bit instructions are mixed.</p>
<p> Generate the bin file:</p>
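<p> The two sizes can be told apart from the lowest two bits of an instruction: in the RISC-V base encoding, 32-bit instructions have those bits set to 11, while compressed 16-bit instructions use the other three patterns. A minimal sketch (the function name is mine, and longer encodings are not handled):</p>

```c
#include <stdint.h>

/* Length in bytes of the instruction whose lowest 16 bits are `low16`.
 * Only the 16-bit and 32-bit encodings used by RV32IMC are covered. */
int rv_insn_len(uint16_t low16)
{
    return (low16 & 0x3) == 0x3 ? 4 : 2;
}
```

<p> Checked against the listing: 0x0001 (nop) and 0x8082 (ret) are 2 bytes, while 0x06400793 (li a5,100) and 0x80001437 (lui) end in binary 11 and are 4 bytes.</p>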
<p><span style="color:#c0392b;"><strong>riscv-nuclei-elf-objcopy -Obinary a.out led.bin</strong></span></p>
<p> </p>
<p> Then load this .bin program file in openocd:</p>
<p><strong><span style="color:#000000;">> halt</span><br />
<span style="color:#999999;">halted at 0x80000008 due to debug interrupt</span><br />
<span style="color:#000000;">> load_image f:/led.bin 0x80000000</span><br />
<span style="color:#999999;">4264 bytes written at address 0x80000000<br />
downloaded 4264 bytes in 2.734375s (1.523 KiB/s)</span></strong></p>
<p> Since there is no startup code, the stack has to be initialized by hand before setting the address the PC starts from:</p>
<p><span style="color:#000000;"><strong>> reg sp 0x90001000</strong></span></p>
<p><span style="color:#000000;"><strong>> resume 0x8000000e</strong></span></p>
<p> Execution then starts at the entry of test(), and the RGB LED can be seen changing.</p>
<p>FPGA resource utilization as reported by Vivado; a handy feature</p>
<p>Connecting directly over JTAG finds the RISC-V CPU</p>