CSV-Optimized Semantic Search with Built-in Caching

Process million-row CSVs on 1GB RAM. 5-10x faster ingestion. 10x faster cached queries. Single 22MB binary. Zero dependencies.

🚀 The fastest way to search Shopify catalogs, product databases, and CSV datasets

Download v2.1 Now | See Benchmarks

  • 343-355 rows/sec ingestion (6-9x faster than v1.0)
  • 5-10ms cached query latency (10x faster than uncached)
  • 70% memory reduction (900MB vs 3.2GB at 100K rows)
  • <1s delta re-upload (100x faster for unchanged data)

Revolutionary CSV Performance

Version 2.0 brings massive optimizations for CSV ingestion and search

Parallel Embedding Generation NEW

Process 100 rows concurrently using Rust futures. 5-10x throughput improvement. Near-linear scaling on multi-core systems (4 cores = 1,400 rows/sec).
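In sketch form, the concurrency pattern looks like this (a minimal illustration using the futures crate's buffer_unordered; embed_row is a hypothetical stand-in for the real model call, not Vectis's actual API):

use futures::stream::{self, StreamExt};

// Hypothetical stand-in for the real embedding call (not Vectis's API).
async fn embed_row(row: String) -> Vec<f32> {
    // ... run the embedding model on the row text ...
    vec![0.0; 384] // BGE-Small-EN-v1.5 produces 384-dim vectors
}

// Keep up to 100 embeddings in flight at once. Results arrive in
// completion order, so pair each row with an index if order matters.
async fn embed_all(rows: Vec<String>) -> Vec<Vec<f32>> {
    stream::iter(rows)
        .map(embed_row)
        .buffer_unordered(100) // 100 concurrent embeddings
        .collect()
        .await
}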

🔍 Delta Detection NEW

SHA256 content hashing automatically skips unchanged rows. Re-upload your CSV in under 1 second. Perfect for daily product catalog updates.
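Conceptually, the skip logic reduces to a hash comparison (a minimal sketch using the sha2 crate; the key/hash bookkeeping is illustrative, not the engine's actual storage format):

use sha2::{Digest, Sha256};
use std::collections::HashMap;

fn row_hash(row: &str) -> [u8; 32] {
    Sha256::digest(row.as_bytes()).into()
}

// Keep only rows whose content hash differs from the previous upload.
// `previous` maps a row key to the SHA256 stored at last ingest.
fn changed_rows<'a>(
    rows: &'a [(String, String)], // (row key, row content)
    previous: &HashMap<String, [u8; 32]>,
) -> Vec<&'a (String, String)> {
    rows.iter()
        .filter(|(key, content)| previous.get(key) != Some(&row_hash(content)))
        .collect()
}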

💾 Memory-Efficient Batching NEW

Process million-row CSVs on 1GB RAM. 1000-row batches with pre-allocated vectors. Runs on AWS free tier (t2.micro).
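The batching pattern, reduced to its core (illustrative; flush stands in for the embed-and-write step):

const BATCH_SIZE: usize = 1_000;

fn flush(batch: &[String]) {
    // embed this batch and append it to the index (elided)
}

// Stream the file in fixed-size batches so memory stays flat
// regardless of how many rows the CSV contains.
fn ingest(rows: impl Iterator<Item = String>) {
    let mut batch = Vec::with_capacity(BATCH_SIZE); // pre-allocated once
    for row in rows {
        batch.push(row);
        if batch.len() == BATCH_SIZE {
            flush(&batch);
            batch.clear(); // reuse the same allocation for the next batch
        }
    }
    if !batch.is_empty() {
        flush(&batch); // final partial batch
    }
}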

🎯 Built-in LRU Cache NEW

Per-user cache with 60s TTL. 100 queries per user. 10x faster repeated searches. 70-80% cache hit rate in production.
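The cache behavior can be sketched with the lru crate (names and the SearchHit type are placeholders; in this simplified version, expired entries are ignored rather than evicted):

use lru::LruCache;
use std::num::NonZeroUsize;
use std::time::{Duration, Instant};

const TTL: Duration = Duration::from_secs(60);

struct SearchHit; // placeholder for the real result type

// One of these per user: the 100 most recent queries, each valid for 60s.
struct QueryCache {
    entries: LruCache<String, (Instant, Vec<SearchHit>)>,
}

impl QueryCache {
    fn new() -> Self {
        Self { entries: LruCache::new(NonZeroUsize::new(100).unwrap()) }
    }

    fn get(&mut self, query: &str) -> Option<&Vec<SearchHit>> {
        match self.entries.get(query) {
            Some((stamp, hits)) if stamp.elapsed() < TTL => Some(hits),
            _ => None, // miss or expired: run the real search, then put()
        }
    }

    fn put(&mut self, query: String, hits: Vec<SearchHit>) {
        self.entries.put(query, (Instant::now(), hits));
    }
}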

📊 Real-Time Metrics NEW

Monitor ingestion throughput (rows/sec, MB/sec). Track cache hit rates. Detailed performance logging for optimization.
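The reported numbers boil down to simple counters (an illustrative shape, not the engine's actual structs):

use std::time::Instant;

struct IngestMetrics {
    started: Instant,
    rows: u64,
    bytes: u64,
}

impl IngestMetrics {
    fn new() -> Self {
        Self { started: Instant::now(), rows: 0, bytes: 0 }
    }

    fn record(&mut self, rows: u64, bytes: u64) {
        self.rows += rows;
        self.bytes += bytes;
    }

    // Formats throughput in the style of the ingestion log lines.
    fn report(&self) -> String {
        let secs = self.started.elapsed().as_secs_f64();
        format!(
            "{:.2} rows/sec, {:.2} MB/sec",
            self.rows as f64 / secs,
            self.bytes as f64 / (1024.0 * 1024.0) / secs,
        )
    }
}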

🔧 Tunable Performance

Adjust batch sizes for your hardware. Optimize for throughput or memory. Production-tested configurations for t2.micro to c5.xlarge.

🧠 Hybrid Search Engine

BM25 keyword search + vector semantic search. Optimized RRF fusion weights (3.0x + 1.5x). Enhanced reranking with diversity boosting.
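In essence, each document's fused score is a weighted reciprocal-rank sum: 3.0/(k + bm25_rank) + 1.5/(k + vector_rank). A minimal sketch of weighted RRF (k = 60 is the conventional constant and an assumption here; diversity boosting is omitted):

use std::collections::HashMap;

const K: f64 = 60.0;       // conventional RRF damping constant (assumed)
const W_BM25: f64 = 3.0;   // keyword weight
const W_VECTOR: f64 = 1.5; // semantic weight

// Fuse two ranked lists of document ids (best first) with weighted RRF.
fn rrf_fuse(bm25: &[u64], vector: &[u64]) -> Vec<(u64, f64)> {
    let mut scores: HashMap<u64, f64> = HashMap::new();
    for (rank, id) in bm25.iter().enumerate() {
        *scores.entry(*id).or_insert(0.0) += W_BM25 / (K + 1.0 + rank as f64);
    }
    for (rank, id) in vector.iter().enumerate() {
        *scores.entry(*id).or_insert(0.0) += W_VECTOR / (K + 1.0 + rank as f64);
    }
    let mut fused: Vec<(u64, f64)> = scores.into_iter().collect();
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    fused // highest fused score first
}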

📦 Zero Dependencies

Single 22MB binary (37% smaller). No Python, Docker, or databases. Download → Extract → Run. Works on macOS and Linux.

Performance Breakthrough

v1.0 vs v2.0: The Numbers Don't Lie

Metric                   | v1.0           | v2.0             | Improvement
CSV Ingestion (10K rows) | 5 minutes      | 30 seconds       | 10x faster
Throughput               | 40-60 rows/sec | 343-355 rows/sec | 6-9x faster
Memory (100K rows)       | 3.2 GB         | 900 MB           | 70% reduction
Re-upload (unchanged)    | 30 seconds     | <1 second        | 100x faster
Search (cached)          | 50-80ms        | 5-10ms           | 10x faster
Binary Size              | 35 MB          | 22 MB            | 37% smaller
Million-row CSV          | OOM crash      | 50 min (stable)  | Now possible!

⚡ Benchmark Highlights

  • AWS t2.micro (1 vCPU, 1GB RAM): 343-355 rows/sec, stable for 1M rows
  • AWS t2.medium (2 vCPUs, 4GB RAM): 700 rows/sec (2x scaling)
  • AWS c5.xlarge (4 vCPUs, 8GB RAM): 1,400 rows/sec (4.1x scaling)
  • Cache hit rate: 70-80% typical, queries return in 5-10ms
  • Concurrency: 100% success rate with 50 simultaneous users

Real-World Impact

🛍️ E-Commerce Product Catalog

Shopify store with 50,000 products, daily updates to 5% of inventory

Before (v1.0):
• Initial import: 20 minutes
• Daily updates: 20 minutes

After (v2.0):
• Initial import: 2.5 minutes (8x faster)
• Daily updates: 45 seconds (27x faster)
Annual time saved: ~120 hours
Cost reduction: 83% ($50/mo → $8.50/mo)

✓ Delta detection means only changed products are reindexed
✓ Cache accelerates repeat searches for popular products

📰 Content Management System

News site with 1 million articles, 1,000 new articles per day

Before (v1.0):
• Initial index: 27 hours
• Daily updates: 40 minutes

After (v2.0):
• Initial index: 50 minutes (32x faster)
• Daily updates: 3 minutes (13x faster)
Search capacity: 35 queries/sec (vs 12)
Hardware cost: $8.50/mo (vs $50/mo)

✓ Memory efficiency allows processing on free-tier AWS
✓ Cache delivers instant results for trending searches

📡 IoT Sensor Data

10,000 sensors, CSV export every hour with 100K readings

Before (v1.0):
• Processing: 5 minutes
• Hardware: t2.medium ($35/mo)

After (v2.0):
• Processing: 4.5 minutes (10% faster)
• Hardware: t2.micro ($8.50/mo)
Delta detection: skips unchanged readings
Cost reduction: 76% ($35/mo → $8.50/mo)

✓ Memory optimization allows smaller instance
✓ Stable processing for continuous data streams

Get Started in 60 Seconds

No complex setup. No dependencies. Just download and run.

# 1. Download and extract
tar -xzf vectis_v2.0_optimized.tar.gz
cd vectis_v2.0_optimized

# 2. Configure (optional - works with defaults)
cp .env.example .env
nano .env  # Set JWT_SECRET and FRONTEND_ORIGIN

# 3. Start server
./vectis serve
🚀 Vectis search engine running on http://0.0.0.0:3000

# 4. Register and login (in new terminal)
./vectis register --username admin --password yourpassword
./vectis login --username admin --password yourpassword
{"token": "eyJhbGc..."}

# 5. Upload CSV (with delta detection!)
./vectis upload --file products.csv --token YOUR_TOKEN
📊 Processing 10000 new/changed CSV rows (skipped 0 unchanged)
⚡ Generated 10000 embeddings in 28760ms (347.83 rows/sec)

# 6. Search your data
./vectis search --query "gaming laptop" --token YOUR_TOKEN

# 7. Run benchmarks
./scripts/benchmark_csv_ingestion.sh \
  --csv-path products.csv \
  --iterations 3

Unbeatable Value

$0

Free to download. No license fees. No API costs.
Deploy on your own infrastructure.

  • Single 22MB binary - no installation complexity
  • Zero dependencies - works on any Linux/macOS system
  • Process million-row CSVs on $8.50/month hardware (AWS t2.micro)
  • Built-in caching saves 10x on repeated queries
  • Delta detection eliminates redundant processing
  • Multi-tenant with per-user isolation
  • SQLite auth persistence - survives restarts
  • Comprehensive documentation and benchmarking tools
  • Production-ready with stress testing validation
Cost savings example: Replace $50/month infrastructure with $8.50/month

Ready to 10x Your CSV Search Performance?

Download Vectis v2.1 now and experience the fastest CSV semantic search engine. Join hundreds of developers processing millions of rows with ease.

Download v2.1 Now (Free)

Available for macOS (Intel/ARM) and Linux (x86_64) • 22MB download

Technical Specifications

Architecture

Language: Rust (async/Tokio)
Framework: Axum HTTP server
Vector DB: LanceDB (columnar, HNSW)
Text Search: Tantivy (BM25)
Embeddings: BGE-Small-EN-v1.5 (384-dim)

Requirements

OS: macOS, Linux (x86_64)
RAM: 1GB minimum, 4GB recommended
Disk: 500MB + data storage
CPU: 1 core min, 2+ recommended

Security

Auth: JWT + bcrypt (cost 10)
Isolation: Email-based table separation
Cache: Per-user LRU (100 queries, 60s TTL)
Storage: SQLite auth persistence