Verification Work:
1. Tiler GPU Unit.
Backend TB - Subsystem Level SystemVerilog.
(October 2022 - February 2023).
● Backend - RTL Description:
- Part of a Geometry Processing and Binning GPU Core.
- Transforms primitives into polygon lists.
2. Texture Mapper GPU Unit.
SystemVerilog.
(December 2021 – October 2022; February 2023 – present).
● Texture Mapper - RTL Description:
- Part of a GPU Shader Core.
- Performs texture lookups, filtering, and decompression for GPU shaders.
● Protocols:
- Message Fabric, AXI, ASN, CHP.
● Sign off on 4 projects.
● Team lead for 3 engineers.
3. Streaming Router Block.
(October 2021 – 2 months).
● Create vPlan (Streaming Router Block uses several protocols including UART, AST and APB).
● Write Verification Specification document.
● Develop VE using UVM-SV and CRCDV methodology.
● Scrum Master.
4. AMM Controller Block.
(September 2020 – 2 months).
● Create Sync Generator vPlan, Sync Interface vPlan and Reset Interface vPlan.
● Write Verification Specification document.
● Develop VE using UVM-SV and CRCDV methodology.
● Debug packet generation and driving.
● Create tests for DUT verification.
● Advanced automation of the verification flow.
● Run regressions.
● Analyze coverage and implement solutions to close coverage holes.
● Write Verification Report.
● Develop VIPs for educational purposes.
● Manage the project using Agile methodology.
5. Demultiplexer Block.
(June 2020 – 1.5 months).
● Create vPlan.
● Write Verification Specification document.
● Develop VE using UVM-SV and CRCDV methodology.
● Analyze coverage and implement solutions to close coverage holes.
● Write Verification Report.
● Develop VIPs for educational purposes.
Design Work:
1. Accelerating an AI image upscaling algorithm on FPGA using High-Level Synthesis.
(October 2020 - 8 months).
● Implement the Fast Super-Resolution Convolutional Neural Network (FSR-CNN) in Python and C++.
● Optimize the algorithm implementation for FPGA (reduce execution time by 42x, optimize area usage), using real-time coding style and pragmas.
● Convert the algorithm to an HDL implementation.
● Synthesize and co-simulate the kernel.
● Verify functional correctness of the Verilog kernel.
● Port the kernel to the PYNQ-Z1 FPGA using DMA and Vivado.
● Evaluate the quality of the upscaled images using Peak Signal-to-Noise Ratio (PSNR).
● Implement the bicubic interpolation algorithm in Python and C++, convert it to an HDL implementation, optimize the kernel, and port the RTL implementation to the PYNQ-Z1 FPGA.
● Analyze and compare the performance profiles of the FSR-CNN and the bicubic interpolation algorithm.
Technologies: Jupyter Notebook, Vivado HLS, Vivado, PYNQ-Z1, FPGA, Convolutional Neural Network, OpenCV, DMA, AXI4-Stream, AXI4-Lite, Pragmas.
2. Parallel to Serial Block.
(June 2020 – 0.5 months).
● Implement the Parallel to Serial Block based on the Design Specification document.
● Debug issues encountered during verification.
● Analyze the code coverage.
Additional Responsibilities:
● Write an AMIQ Blog article - How to Accelerate an Image Upscaling CNN on FPGA using HLS.
● Mentor for 5 interns.
● Mentor for a student's graduation thesis - Develop a FCOV Maximizer using Python Reflection in CRCDV.