Sedemos News

Jan 20, 2025

[paper] Spatz: open-source RISC-V compact VPU

Matteo Perotti, Samuel Riedel, Matheus Cavalcante and Luca Benini^1,2

Spatz: Clustering Compact RISC-V-Based Vector Units to Maximize Computing Efficiency

arXiv:2309.10137v2 [cs.AR] 9 Jan 2025

1 IIS, ETH Zurich (CH)
2 DEI, Uni. Bologna (IT)

Abstract: The ever-increasing computational and storage requirements of modern applications and the slowdown of technology scaling pose major challenges to designing and implementing efficient computer architectures. To mitigate the bottlenecks of typical processor-based architectures on both the instruction and data sides of the memory, we present Spatz, a compact 64-bit floating-point-capable vector processor based on RISC-V’s Vector Extension Zve64d. Using Spatz as the main Processing Element (PE), we design an open-source dual-core vector processor architecture based on a modular and scalable cluster sharing a Scratchpad Memory (SCM). Unlike typical vector processors, whose Vector Register Files (VRFs) are hundreds of KiB large, we prove that Spatz can achieve peak energy efficiency with a latch-based VRF of only 2 KiB. An implementation of the Spatz-based cluster in GlobalFoundries’ 12LPP process with eight double-precision Floating Point Units (FPUs) achieves an FPU utilization just 3.4% lower than the ideal upper bound on a double-precision, floating-point matrix multiplication. The cluster reaches 7.7 FMA/cycle, corresponding to 15.7 DP-GFLOPS and 95.7 DP-GFLOPS/W at 1 GHz and nominal operating conditions (TT, 0.80 V, 25 °C), with more than 55% of the power spent on the FPUs. Furthermore, the optimally-balanced Spatz-based cluster reaches a 95.0% FPU utilization (7.6 FMA/cycle), 15.2 DP-GFLOPS, and 99.3 DP-GFLOPS/W (61% of the power spent in the FPU) on a 2D workload with a 7 × 7 kernel, resulting in an outstanding area/energy efficiency of 171 DP-GFLOPS/W/mm². At equi-area, the computing cluster built upon compact vector processors reaches a 30% higher energy efficiency than a cluster with the same FPU count built upon scalar cores specialized for stream-based floating-point computation.
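
Editor's note: the figures quoted for the 2D workload can be related to each other with a little arithmetic. The sketch below is only a back-of-the-envelope check, not something taken from the paper. It assumes the usual convention that one FMA counts as two floating-point operations, and that the 1 GHz clock quoted for the matrix multiplication also applies to the 2D workload. The function names are illustrative.

```python
# Back-of-the-envelope check of the 2D-workload figures quoted in the abstract.
# Assumptions (editor's, not stated for this workload in the abstract):
#   - one FMA is counted as two floating-point operations
#   - the cluster runs at 1 GHz, as quoted for the matrix multiplication

NUM_FPUS = 8          # double-precision FPUs in the cluster (from the abstract)
FLOPS_PER_FMA = 2     # assumed convention: multiply + add
CLOCK_GHZ = 1.0       # assumed clock for the 2D workload


def fpu_utilization(fma_per_cycle, num_fpus=NUM_FPUS):
    """Fraction of the peak FMA issue rate that is actually sustained."""
    return fma_per_cycle / num_fpus


def throughput_gflops(fma_per_cycle, clock_ghz=CLOCK_GHZ):
    """Sustained throughput in GFLOPS for a given FMA rate and clock."""
    return fma_per_cycle * FLOPS_PER_FMA * clock_ghz


def implied_power_w(gflops, gflops_per_w):
    """Power implied by a throughput and an energy-efficiency figure."""
    return gflops / gflops_per_w


if __name__ == "__main__":
    print(f"FPU utilization: {fpu_utilization(7.6):.1%}")
    print(f"Throughput:      {throughput_gflops(7.6):.1f} GFLOPS")
    print(f"Implied power:   {implied_power_w(15.2, 99.3):.3f} W")
```

Under these assumptions, 7.6 FMA/cycle on eight FPUs gives the quoted 95.0% utilization and 15.2 GFLOPS. The implied power of roughly 0.15 W is a derived value, not a number reported in the abstract.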

Fig: Placed-and-routed Spatz-based shared-L1 cluster, implemented as a 737 μm × 1003 μm block. The cluster’s main blocks are highlighted: namely the Snitch cores, VRFs, IPUs, FPUs, L1 SPM, and I$.

Acknowledgment: This work was supported in part through the TRISTAN (#101095947) and the ISOLDE (#101112274) projects, both funded through the Chips Joint Undertaking (CHIPS JU) of the European Union’s Horizon Europe’s research and innovation programme and its members.

[book] From Code to Chip

Jakob Ratschenberger and Harald Pretl

From Code to Chip: Open-Source Automated Analog Layout Design

Pages: XV, 120. Publisher: Springer Cham (10 January 2025)

eBook ISBN 978-3-031-68562-0

This book shows how the layout of an analog circuit can be automatically generated in a fully open-source way. Based on an exemplary design flow, it introduces and explains the necessary steps for transforming a SPICE netlist into a layout, which can be inspected by the open-source layout editor Magic VLSI. This is done by using the industry’s first open-source process design kit SKY130. Furthermore, the implementation of the design flow in the programming language Python is available as open-source on GitHub.

Authors' Affiliations

Johannes Kepler University, Linz, Austria

Table of contents (8 chapters)

Front Matter pp. i-xv
Introduction pp. 1-4
Theoretical Basics pp. 5-13
Circuit Capturing pp. 15-36
PDK—Design Rule Capturing pp. 37-41
Placement pp. 43-55
Routing pp. 57-71
Experimental Results pp. 73-99
Outlook pp. 101-103
Back Matter pp. 105-120

[paper] Compact Model of Linear Passive IPD

Zhang, Zijian

Compact Model of Linear Passive Integrated Photonics Device for Photon Design Automation

arXiv preprint: 2501.06774 (2025)

1 University of Electronic Science and Technology of China, Chengdu, 611731, China

Abstract: As integrated photonic systems grow in scale and complexity, Photonic Design Automation (PDA) tools and Process Design Kits (PDKs) have become increasingly important for layout and simulation. However, fixed PDKs often fail to meet the rising demand for customization, compelling designers to spend significant time on geometry optimization using FDTD, EME, and BPM simulations. To address this challenge, we propose a data-driven Eigenmode Propagation Method (DEPM) based on the unitary evolution of optical waveguides, along with a compact model derived from intrinsic waveguide Hamiltonians. The relevant parameters are extracted via complex coupled-mode theory. Once constructed, the compact model enables millisecond-scale simulations that achieve accuracy on par with 3D-FDTD, within the model’s valid scope. Moreover, this method can swiftly evaluate the effects of manufacturing variations on device and system performance, including both random phase errors and polarization-sensitive components. The data-driven EPM thus provides an efficient and flexible solution for future photonic design automation, promising further advancements in integrated photonic technologies.
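
Editor's note: to give a feel for what "unitary evolution" of a waveguide system means, the sketch below propagates light through two identical coupled waveguides using textbook coupled-mode theory. It is not the DEPM described in the paper. The two-mode Hamiltonian H = [[beta, kappa], [kappa, beta]], the function names, and the numerical values of beta and kappa are illustrative assumptions.

```python
import cmath
import math


def coupler_transfer_matrix(beta, kappa, length):
    """Closed form of U = exp(-i * H * length) for the assumed two-mode
    Hamiltonian H = [[beta, kappa], [kappa, beta]] (identical waveguides).

    beta:   propagation constant of each waveguide (rad/um)
    kappa:  coupling coefficient between the waveguides (rad/um)
    length: propagation length (um)
    """
    phase = cmath.exp(-1j * beta * length)   # common phase factor
    c = math.cos(kappa * length)             # amplitude staying in the input guide
    s = math.sin(kappa * length)             # amplitude coupled to the other guide
    return [[phase * c, -1j * phase * s],
            [-1j * phase * s, phase * c]]


def propagate(u, amplitudes):
    """Apply a 2x2 transfer matrix to a pair of complex mode amplitudes."""
    a0, a1 = amplitudes
    return [u[0][0] * a0 + u[0][1] * a1,
            u[1][0] * a0 + u[1][1] * a1]


def output_powers(beta, kappa, length, amplitudes=(1.0, 0.0)):
    """Optical power in each waveguide after propagating over `length`."""
    out = propagate(coupler_transfer_matrix(beta, kappa, length), amplitudes)
    return [abs(a) ** 2 for a in out]


if __name__ == "__main__":
    beta = 2 * math.pi * 2.4 / 1.55    # hypothetical: n_eff = 2.4 at 1.55 um
    kappa = 0.05                       # hypothetical coupling coefficient
    half_length = math.pi / (4 * kappa)  # length at which power splits evenly
    through, cross = output_powers(beta, kappa, half_length)
    print(f"through: {through:.3f}, cross: {cross:.3f}, total: {through + cross:.3f}")
```

Because the transfer matrix is unitary, the total power is conserved at every length. Evaluating a closed-form matrix like this is what makes a compact model cheap compared with a full-wave simulation.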

Fig: Photon design automation workflow based on compact model of linear passive optical waveguide

Supplementary information:

The time evaluations were conducted on a system equipped with an Intel i9-10850K processor, 64 GB DDR4 memory, and NVIDIA Quadro RTX 5000 professional graphics processor.

Jan 18, 2025

[paper] Strategic Thinking on Open-Source PDK

(Medium link)

Written by Jun-ichi OKAMURA

IEEE Senior member (Bio)

In 1990, NHK hailed Japan as an "electronic powerhouse," spotlighting the semiconductor industry. Now, three decades later, the spotlight has swung back onto semiconductors, though the star this time is cutting-edge manufacturing technology. In this piece, however, the author would like to shift the focus to the design side of semiconductors. This article follows in the footsteps of several earlier posts: "The Tale of PDKs, Past and Present" posted Dec. 3, 2023; "A Qualitative Cost Analysis of the Semiconductor Business" posted Aug. 1, 2024; and "Semiconductors We Want to Make, Semiconductors We Want to Use" posted Dec. 23, 2024. The author is grateful that these pieces still receive steady traffic and hopes they have helped broaden understanding of the design aspects of semiconductors for those in the industry.

This time, under the title “Strategic Thinking on Open-Source PDK (Process Design Kit),” the author aims to explain key points about open-source PDKs clearly. The discussion doesn’t stop at semiconductor designers; it also addresses perspectives relevant to foundries (semiconductor manufacturing service providers) and those planning semiconductor-related businesses. The author hopes you’ll find it an engaging read and a helpful resource.

Note: The views expressed here are the author's own, based on past work experience, and do not represent any organization.

Jan 13, 2025

[paper] SPICE Compact FLASH Memory Model

Jung Rae Cho 1, Donghyun Ryu 2,3, Donguk Kim 1, Wonjung Kim 1, Yeonwoo Kim 2,3, Changwook Kim 1, Yoon Kim 4, Myounggon Kang 5, Jiyong Woo 6, and Dae Hwan Kim 1

Physics-Based SPICE-Compatible Compact Model of FLASH Memory With Poly-Si Channel for Computing-in-Memory Applications

in IEEE Journal of the Electron Devices Society, vol. 13, pp. 1-7, 2025

doi: 10.1109/JEDS.2024.3511581.

1 School of Electrical Engineering, Kookmin University, Seoul 02707, South Korea
2 Department of Electrical and Computer Engineering, Seoul National University, Seoul 08826, South Korea
3 Inter-University Semiconductor Research Center, Seoul National University, Seoul 08826, South Korea
4 School of Electrical and Computer Engineering, University of Seoul, Seoul 02504, South Korea
5 School of Advanced Fusion Studies, University of Seoul, Seoul 02504, South Korea
6 School of Electronic and Electrical Engineering, Kyungpook National University, Daegu 41566, South Korea

ABSTRACT: Recently, three-dimensional FLASH memory with multi-level cell characteristics has attracted increasing attention to enhance the capabilities of artificial intelligence (AI) by leveraging computing-in-memory (CIM) systems. The focus is to maximize the computing performance and design FLASH memory suitable for various AI algorithms, where the memory must achieve a highly controllable multi-level threshold voltage (VT). Therefore, we developed a SPICE compact model that can rapidly simulate charge trap FLASH cells for CIM to identify optimal programming conditions. SPICE simulation results of the transfer characteristics are in good agreement with the results of experimentally fabricated FLASH memory, showing a low error rate of 10%. The model was also validated against the results obtained from the TCAD tool, showing that a consistent VT change was computed in a shorter time than that required using TCAD. Then, the developed model was used to comprehensively investigate how single or multiple gate voltage (VG) pulses affect VT. Moreover, considering recent FLASH memory fabrication processes, we found that grain boundaries in polycrystalline silicon channel materials can be involved in deteriorating gate controllability. Therefore, optimizing the pulse scheme by correcting potential errors identified in advance through fast SPICE simulation can enable the accurate achievement of the specific analog states of the FLASH cells of the CIM architecture, boosting computing performance.

FIG: Device structure of FLASH memory cell for TCAD Sentaurus simulation and its transfer characteristics of FLASH memory obtained from measurement and SPICE simulation.

Acknowledgements: This work was supported in part by the Institute of Information and Communications Technology Planning and Evaluation (IITP) funded by the Korea Government (MSIT) under Grant 2021-0-01764-001; in part by the National Research Foundation of Korea (NRF) funded by the Korean Government (MSIT) under Grant RS-2023-00208661; in part by the Ministry of Trade, Industry & Energy (MOTIE) under Grant 1415187390; in part by the Korea Semiconductor Research Consortium (KSRC) support program for the Development of the Future Semiconductor Device under Grant 00231985; and in part by the 2023 Research Fund of Kookmin University, South Korea. The work of Jiyong Woo was supported by the National Research and Development Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science and ICT under Grant RS-2023-00258227.