Sometimes, you just need things to be fast. Like, really fast. That was the core idea simmering in my head that eventually led to "Bulwark," an internal project I've been pouring time into lately. Inspired by CQEngine (a fantastic Java library tackling similar problems), I wanted something similar but built with Rust's strengths in performance and safety.
We often rely on databases – powerful, persistent, feature-rich workhorses. But what happens when your dataset comfortably fits in your server's RAM, and the absolute lowest query latency is paramount? Fetching from disk, even SSDs, or hopping over the network introduces overhead. Milliseconds matter. Microseconds matter.
I wanted to explore building a library that treated RAM as the primary storage medium, specifically for enabling rapid querying of structured data. Think of it like giving your application its own super-fast, internal search engine for its live data, tailored to its specific needs.
The Core Idea: Smarter Indexing in Memory
The fundamental concept remains: keep your data directly accessible in memory, but build smart indexes on top of it. Just like a database uses indexes, Bulwark needed efficient ways to find data based on specific criteria without slow, linear scans.
This meant not only choosing the right data structures but also thinking about how they'd be used together:
- Equality (`user_id == 123`): Concurrent hash maps (like `lockfree::map`) are still king here for average O(1) lookups.
- Ranges & Sorting (`timestamp > T`, `ORDER BY date`): Concurrent skip lists (`xs_collections::skiplist`) provide great logarithmic performance for ranges and naturally support sorted retrieval with good read parallelism.
- Text Searching (`name STARTS WITH "prefix"`, `log CONTAINS "error"`): Radix and Suffix trees (specifically variations built on Adaptive Radix Trees from `xs_trees`) handle these specialized string searches efficiently.
- Low Cardinality Fields (`status IN (1, 2)`, `is_active == true`): This is where things got more interesting recently. For fields with only a few distinct values (like status codes, flags, enums), Bitmap indexes (`roaring::RoaringBitmap`-based) are incredibly powerful. They represent sets of matching items using compressed bitmaps, allowing for lightning-fast boolean logic (AND, OR) using cheap bitwise operations. Finding all "active VIP users" (`status == active AND user_type == vip`) can become almost trivial if both attributes have bitmap indexes (see the quick sketch after this list).
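To make that last point concrete, here's a tiny standalone sketch using the `roaring` crate directly, entirely outside Bulwark (so none of these names are Bulwark APIs, and the attribute values are just illustrative). Each bitmap records which item positions match one value, and the boolean AND is a single bitwise intersection:

```rust
use roaring::RoaringBitmap;

fn main() {
    // One bitmap per attribute value: bit N set means "item N matches".
    let mut active = RoaringBitmap::new(); // status == active
    let mut vip = RoaringBitmap::new();    // user_type == vip

    active.insert(1);
    active.insert(2);
    active.insert(5);
    vip.insert(2);
    vip.insert(5);
    vip.insert(9);

    // "active VIP users" is just the intersection of the two bitmaps.
    let active_vips = &active & &vip;
    assert_eq!(active_vips.iter().collect::<Vec<u32>>(), vec![2, 5]);
    println!("matching item ids: {:?}", active_vips.iter().collect::<Vec<u32>>());
}
```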
Designing the "Feel": The Evolved API
The goal was an intuitive flow. While the core setup remains similar, the querying part has evolved to offer more flexibility:
1. Define Your Data: Regular Rust structs. `Clone` is now required for querying.
// Note: Clone is now needed for querying.
// PartialOrd must be derived alongside Ord for this to compile.
#[derive(Debug, Clone, Eq, PartialEq, Hash, PartialOrd, Ord)]
struct Record { id: u64, label: String, value: i32, status_code: u8, tags: Vec<String> }
2. Create a Collection:
use bulwark::IndexedCollection; // Assuming re-exported
use bulwark::stores::HashedItemStore;
let data_store = IndexedCollection::<Record, HashedItemStore<Record>>::new_hashed("my_records".to_string());
3. Define Attributes: Tell Bulwark how to get data using `Arc<Attribute<It>>`.
use bulwark::{Attribute, types::Value};
use std::sync::Arc;
let label_attr = Arc::new(Attribute::new("label", |r: &Record| Value::from_string(&r.label)));
let value_attr = Arc::new(Attribute::new("value", |r: &Record| Value::Int32(r.value)));
let status_attr = Arc::new(Attribute::new("status", |r: &Record| Value::UInt8(r.status_code)));
4. Build Indexes: Choose the right index type.
use bulwark::indexes::{hash_index, concurrent_skiplist_index, bitmap_index};
data_store.add_index(hash_index(label_attr.clone()))?; // Equality
data_store.add_index(concurrent_skiplist_index(value_attr.clone()))?; // Ranges/Sorting
data_store.add_index(bitmap_index(status_attr.clone()))?; // Fast boolean logic
5. Add Data:
data_store.add(Record { id: 1, label: "important".into(), value: 150, status_code: 1, tags: vec![] })?;
// ... add more records ...
6. Querying - Prepare, then Execute: The core query flow now involves preparing the query first, which allows choosing the result format.
use bulwark::queries; // Query builders
use bulwark::options::{QueryOptions, SortOption};
use bulwark::{ItemResults, QueryResultIter, Error}; // Result types
// Build the query expression
let my_query = queries::and(vec![
    queries::equal_to(label_attr.clone(), Value::from_str("important")),
    queries::greater_than(value_attr.clone(), Value::Int32(100)),
    queries::or(vec![ // Planner might use Bitmap OR here
        queries::equal_to(status_attr.clone(), Value::UInt8(1)),
        queries::equal_to(status_attr.clone(), Value::UInt8(2))
    ])
]);
// --- Prepare the Query ---
let prepared_query = data_store.prepare_query(my_query);
// --- Execute for Materialized Results ---
match prepared_query.clone().run() { // Clone if using prepared query again
    ItemResults::Collection(items) => {
        println!("Found {} matching records (materialized).", items.len());
        // Process Vec<Arc<Record>>
    }
    ItemResults::None => println!("No matching records found (materialized)."),
    ItemResults::Error(e) => eprintln!("Materialized query error: {}", e),
}
// --- Execute for Streaming Results (Requires `streaming` feature) ---
#[cfg(feature = "streaming")]
{
    println!("Streaming results:");
    let stream: QueryResultIter<Record, _> = prepared_query.stream(); // Consumes prepared_query
    for result in stream {
        match result {
            Ok(record_arc) => println!(" -> Streamed Record ID: {}", record_arc.id),
            Err(e) => { eprintln!(" -> Stream Error: {}", e); break; }
        }
    }
}
// --- Convenience methods still exist ---
// let results_vec = data_store.query(query); // Materialized shortcut
// #[cfg(feature = "streaming")]
// let results_stream = data_store.query_stream(query); // Streaming shortcut
The Journey & Evolving Challenges
Beyond just picking structures, the real fun (and hair-pulling) involved:
- Concurrency: Making reads fast while writes happen is hard. We settled on `parking_lot::RwLock` around index structures, allowing many readers or one writer, combined with `lockfree::map` for core storage. Keeping statistics (like counts or min/max values for skip lists) updated concurrently required careful use of atomics and locks. (There's a minimal sketch of the locking pattern after this list.)
- Query Planning (CBO Lite): Initially, the query execution was simple. But how do you optimally run `A == 1 AND B > 100`? Do you find all `A == 1` first then filter by B, or vice versa? What if A has a Hash index and B has a SkipList? What if both have Bitmap indexes? Bulwark now has a basic Cost-Based Optimizer. It gathers simple statistics from indexes (counts, distinct values, min/max) and estimates the "cost" (a blend of CPU work and expected result count) of using different indexes or different join strategies (Hash Intersection, Index Probing, Sorted Merge, Bitmap operations). It then picks the plan it thinks will be fastest. It's rudimentary, needing better stats (histograms!), but a huge step up from naive execution. (A toy version of the selectivity estimate appears below.)
- Ergonomics & Error Handling: Still a balancing act. We landed on the `PreparedQuery` pattern for flexibility. Materialized results return `ItemResults` (Collection/None/Error). The streaming `QueryResultIter` yields `Result<Arc<It>, Error>`, allowing per-item error handling during iteration.
- Lifetimes: Always a journey in Rust! Refactoring index query methods to return iterators while managing lock-guard lifetimes required careful design, sometimes necessitating eager collection within specific iterator steps (like index scans) as a pragmatic compromise to satisfy the borrow checker when true lock-holding streaming wasn't feasible.
- Streaming Execution: Implementing the streaming query execution path (enabled via the `streaming` feature flag) was a major recent step. It uses an internal state machine (`ExecutionIter`) to process the query plan lazily, yielding keys one by one. The public `QueryResultIter` then wraps this, hydrating keys into `Arc<It>` on demand. This significantly reduces peak memory usage for queries returning large numbers of items, though sorting still requires internal materialization. (A stripped-down illustration of that hydration step follows below.)
Where It Stands Now
Bulwark is significantly more capable than when I started. It handles diverse query types, uses concurrent structures, makes basic optimization choices, leverages specialized Bitmap indexes, and now offers both materialized and memory-efficient streaming query results. The API provides flexibility through the `PreparedQuery` object.
Of course, there's always more: implementing true histograms for the CBO, adding more index types or query operators, further optimizing concurrent statistics, refining the streaming logic for range scans... the list goes on. But it's been a fantastic learning experience in building a specialized, high-performance data tool in Rust. Sometimes, you don't need a giant database; you just need a well-aimed, in-memory Bulwark, now with the option to stream.