Memristor Hardware Breakthrough Enables Nonlinear Sorting

Summary: Researchers have built the first practical sort-in-memory hardware system that handles complex, nonlinear sorting without conventional comparators. By combining a novel Digit Read mechanism with a Tree Node Skipping algorithm and several cross-array scaling strategies, the team demonstrated a fast, energy-efficient, and reconfigurable memristor-based architecture suitable for AI, big data, and edge computing workloads.

Benchmark evaluations of a fabricated prototype show substantial improvements in throughput, energy consumption, and area efficiency compared with leading ASIC-based sorters, marking a major step forward for processing-in-memory (PIM) technology.

Key Facts:

  • Comparator-Free Design: The Digit Read method and Tree Node Skipping (TNS) algorithm remove traditional comparator logic, eliminating a major bottleneck for in-memory sorting.
  • Significant Performance Gains: The prototype achieved up to 7.7× higher throughput, 160× better energy efficiency, and about 32× better area efficiency than comparable ASIC sorters.
  • Wide Applicability: The architecture was validated on diverse workloads, including shortest-path search, neural network inference, and large-scale data sorting.

Source: Peking University

A research team led by Prof. Yang Yuchao from the School of Electronic and Computer Engineering at Peking University Shenzhen Graduate School has developed a breakthrough sort-in-memory system specifically designed to handle nonlinear sorting tasks efficiently.

Published in Nature Electronics under the title “A fast and reconfigurable sort-in-memory system based on memristors,” the paper presents a comparator-free PIM architecture that addresses one of the most persistent challenges in memristor-based computing: accelerating nonlinear operations such as sorting directly inside memory.

Together, these innovations form a flexible and adaptable sorting accelerator capable of handling varying data widths and complexities. Credit: Neuroscience News

Background

Sorting is a core operation across computing domains, but its nonlinear behavior has made it difficult to accelerate using in-memory techniques that excel at linear algebra. Memristor-based PIM has already shown promise for matrix and vector operations, yet sorting remained limited by reliance on comparison operations and comparator networks. This project rethinks sorting inside memory by removing comparators and leveraging per-cell digit reads to guide the process.

Why it matters

Extending PIM beyond linear workloads to nonlinear tasks such as sorting widens the scope of memory-centric computing. A scalable, reconfigurable sort-in-memory accelerator can substantially reduce data movement, boost throughput, and lower energy use in AI training and inference, real-time analytics, graph algorithms, and edge devices where power and area are constrained.

Key findings

The system uses a one-transistor–one-resistor (1T1R) memristor array and a Digit Read mechanism that extracts digit-level information directly from memory cells. This digit-centric readout replaces traditional compare-select logic and enables a comparator-free sorting flow. Building on that foundation, the team introduced the Tree Node Skipping (TNS) algorithm, which optimizes traversal paths in the sort tree to avoid redundant operations and improve latency.
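The article does not spell out the Digit Read circuit or the exact TNS rules, so the following Python sketch is only an illustration under stated assumptions. It models a digit read as returning one stored bit per selected row, sorts unsigned integers by walking the implied binary tree from the most significant bit, and treats "skipping" as resuming from a deferred tree node instead of restarting at the top. The function names (`digit_read`, `sort_restart`, `sort_with_skipping`) and the read counter are hypothetical and are not taken from the paper.

```python
def digit_read(cells, rows, bit):
    """Toy model of a digit read: the stored bit at position `bit` for each row."""
    return {r: (cells[r] >> bit) & 1 for r in rows}


def sort_restart(values, width):
    """Baseline: every minimum search restarts at the most significant bit."""
    cells = list(values)
    remaining = list(range(len(cells)))
    out, reads = [], 0
    while remaining:
        cand = remaining
        for bit in range(width - 1, -1, -1):
            bits = digit_read(cells, cand, bit)
            reads += 1
            zeros = [r for r in cand if bits[r] == 0]
            if zeros:  # some candidate has a 0 here, so rows with a 1 are larger
                cand = zeros
        out.extend(cells[r] for r in cand)  # rows left over hold the minimum
        remaining = [r for r in remaining if r not in cand]
    return out, reads


def sort_with_skipping(values, width):
    """Assumed skipping rule: park the '1' branch of each tree node and resume
    from it later, rather than re-reading the bits above it."""
    cells = list(values)
    out, reads = [], 0
    stack = [(list(range(len(cells))), width - 1)]  # (rows in node, next bit)
    while stack:
        rows, bit = stack.pop()
        if bit < 0 or len(rows) == 1:
            out.extend(cells[r] for r in rows)  # leaf: equal values or a single row
            continue
        bits = digit_read(cells, rows, bit)
        reads += 1
        zeros = [r for r in rows if bits[r] == 0]
        ones = [r for r in rows if bits[r] == 1]
        if ones:
            stack.append((ones, bit - 1))   # deferred node, visited later
        if zeros:
            stack.append((zeros, bit - 1))  # smaller values, visited first
    return out, reads
```

In this model, `sort_with_skipping([5, 3, 7, 3, 0, 6], 3)` returns the values in ascending order while issuing fewer column reads than `sort_restart` on the same input. Neither routine compares two stored values with each other; both only test individual bits returned by the read.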

To scale throughput and adapt to different data types and sizes, the researchers developed three cross-array TNS strategies: Multi-Bank, Bit-Slice, and Multi-Level. Multi-Bank partitions large datasets across multiple arrays so they can be processed in parallel. Bit-Slice splits the bits of each value across arrays so that sorting stages can be pipelined. Multi-Level exploits memristors’ multiple conductance states to increase intra-cell parallelism and support finer-grained operations. Combined, these methods produce a flexible accelerator that can be reconfigured for varying precision and throughput targets.
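One way to picture the three strategies is as three different data layouts. The sketch below is an assumption-laden illustration in Python, not the paper's mapping: it only shows how a list of unsigned integers might be spread over arrays in each case, and it does not model the sorting control flow. The helper names, the round-robin bank assignment, and the parameters `slice_width` and `bits_per_cell` are all hypothetical.

```python
def multi_bank(values, n_banks):
    """Multi-Bank (assumed layout): rows are dealt round-robin to n_banks arrays."""
    return [values[b::n_banks] for b in range(n_banks)]


def bit_slice(values, width, slice_width):
    """Bit-Slice (assumed layout): one array per group of bits,
    most significant slice first."""
    n_slices = -(-width // slice_width)  # ceiling division
    mask = (1 << slice_width) - 1
    return [[(v >> (slice_width * s)) & mask for v in values]
            for s in range(n_slices - 1, -1, -1)]


def multi_level(values, width, bits_per_cell):
    """Multi-Level (assumed layout): each cell stores one digit of base
    2**bits_per_cell, most significant digit first."""
    n_digits = -(-width // bits_per_cell)  # ceiling division
    mask = (1 << bits_per_cell) - 1
    return [[(v >> (bits_per_cell * d)) & mask
             for d in range(n_digits - 1, -1, -1)]
            for v in values]
```

For example, with 4-bit values and `bits_per_cell=2`, `multi_level` stores each value in two cells rather than four, so a digit-by-digit traversal in this model visits half as many positions per value.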

Application demonstrations

The team fabricated a memristor chip and integrated it with an FPGA and PCB to build a full end-to-end demonstration platform. In benchmarks against state-of-the-art ASIC sorters, the prototype delivered up to 7.70× improvement in speed, 160.4× improvement in energy efficiency, and 32.46× improvement in area efficiency. These gains stem from reduced data movement, parallel in-memory operations, and the elimination of comparator logic.

Real-world tests further illustrate the system’s versatility. For Dijkstra’s shortest-path computations, the platform computed shortest routes among 16 Beijing Metro stations with low latency and power use. In neural-network inference, the team combined TNS with memristor-based matrix-vector multiplication inside a PointNet++ implementation to enable run-time tunable sparsity; this integration yielded up to 15× faster inference and 67.1× better energy efficiency in the evaluated scenarios.
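Both demonstrations rely on repeatedly picking an extreme value: the closest unvisited node in Dijkstra's algorithm, and the largest entries to keep when pruning. The Python sketch below shows how a bit-serial search could fill that role. It is a rough illustration under assumptions: the graph is a made-up four-node example (not the Beijing Metro data), the pruning scores are assumed to be unsigned integers because the article does not say which quantity is pruned, and `extreme_rows`, `dijkstra`, and `top_k_mask` are hypothetical names.

```python
def extreme_rows(values, rows, width, largest=False):
    """Bit-serial search: rows holding the smallest (or largest) stored value."""
    cand = list(rows)
    want = 1 if largest else 0
    for bit in range(width - 1, -1, -1):
        hits = [r for r in cand if ((values[r] >> bit) & 1) == want]
        if hits:  # narrow the candidates only if some row has the wanted bit
            cand = hits
    return cand


def dijkstra(graph, source, width=8):
    """graph: {node: {neighbor: weight}}. Assumes every real distance is
    smaller than 2**width - 1, which is used as the 'unreached' marker."""
    inf = (1 << width) - 1
    nodes = list(graph)
    index = {n: i for i, n in enumerate(nodes)}
    dist = [inf] * len(nodes)
    dist[index[source]] = 0
    unvisited = list(range(len(nodes)))
    while unvisited:
        u = extreme_rows(dist, unvisited, width)[0]  # closest unvisited node
        unvisited.remove(u)
        if dist[u] == inf:
            break  # everything left is unreachable
        for v, w in graph[nodes[u]].items():
            # Edge relaxation is ordinary arithmetic outside the sorter.
            dist[index[v]] = min(dist[index[v]], dist[u] + w, inf)
    return {n: dist[index[n]] for n in nodes}


def top_k_mask(scores, k, width=8):
    """0/1 mask keeping the k largest scores; ties are broken by row order.
    Changing k at run time changes the sparsity."""
    rows, keep = list(range(len(scores))), []
    while rows and len(keep) < k:
        best = extreme_rows(scores, rows, width, largest=True)
        take = best[: k - len(keep)]
        keep.extend(take)
        rows = [r for r in rows if r not in take]
    return [1 if r in keep else 0 for r in range(len(scores))]
```

With the hypothetical graph `{"A": {"B": 4, "C": 1}, "B": {"D": 1}, "C": {"B": 2, "D": 5}, "D": {}}`, `dijkstra(graph, "A")` gives distances of 0, 3, 1, and 4 for A, B, C, and D. Calling `top_k_mask([12, 3, 200, 45, 45, 7], 3)` keeps the entries 200, 45, and 45.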

Future implications

By demonstrating a flexible, efficient, and scalable comparator-free sort-in-memory architecture, this work broadens the capabilities of memristor-based PIM. It establishes a practical path for accelerating nonlinear computations within memory arrays, enabling next-generation intelligent hardware for AI workloads, real-time analytics, graph processing, and edge computing. The techniques presented—Digit Read, Tree Node Skipping, and cross-array scaling strategies—provide a foundation for further research and system-level integration of in-memory sorting accelerators.

About this AI and computational neuroscience research news

Author: Jiang Zhang
Contact: Jiang Zhang – Peking University
Image: The image is credited to Neuroscience News

Original Research: Closed access.
“A fast and reconfigurable sort-in-memory system based on memristors” by Yang Yuchao et al. Nature Electronics


Abstract

A fast and reconfigurable sort-in-memory system based on memristors

Sorting is a foundational operation in modern computing, yet conventional hardware sorters are limited by von Neumann bottlenecks and the energy cost of moving data between memory and processors. Memristor-based sort-in-memory offers a promising alternative, but prior approaches still relied on comparison operations that limit scalability and efficiency.

This work describes a comparison-free sort-in-memory system that uses digit reads from one-transistor–one-resistor memristor arrays to perform sorting directly within memory. The digit-read Tree Node Skipping method supports a range of data sizes and types, while cross-array strategies—multi-bank, bit-slice, and multi-level—enable scalable performance across arrays and dataset configurations.

Experimental results from a fabricated prototype show substantial improvements in throughput, energy efficiency, and area efficiency compared with conventional sorting systems. The approach is applicable to practical tasks such as Dijkstra’s shortest-path search and neural-network inference with in situ pruning, demonstrating compatibility with other compute-in-memory schemes and broad relevance to AI and data-intensive applications.