Therismos

θερισμός (Greek; noun): harvest.

A Python library for modeling queries, filters, expressions, grouping, and aggregations as object structures.

Features

  • Backend-agnostic modeling: Build expressions, filters, sorting, and aggregations independent of any specific backend
  • Declarative DSL: Natural Python syntax for building complex queries
  • Type safety: Optional field type declarations with automatic casting
  • Immutable structures: All nodes are immutable and thread-safe
  • Automatic normalization: Compound expressions are automatically flattened
  • Powerful optimizer: Detects contradictions, tautologies, and simplification opportunities
  • Grammar-based serialization: Convert expressions to/from compact strings for URLs and APIs
  • Visitor pattern: Extensible architecture for converting expressions to backend formats such as MongoDB, Polars, pandas, SQLAlchemy, SQLModel, and custom backends
  • Optimization tracking: Optional tracking of all optimization transformations
  • Sorting specifications: Model sort criteria as objects with optimization and visitor support
  • Grouping and aggregation: Model grouping and aggregation criteria as objects
  • Expression templates: Parameterized, persistable filter expressions with named placeholders and a transform pipeline DSL
  • Field pruning and projection: Remove or project field-based constraints with polarity-aware semantics
  • Structural equality: All expression types support == and hashing for use in sets, dicts, and equality-based testing
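
To make a few of these features concrete (immutable nodes, structural equality, automatic flattening, and visitor-based conversion), here is a minimal standalone sketch. It does not use Therismos and does not reflect its internals: `Cmp`, `And`, `and_`, and `MongoVisitor` are hypothetical names invented for this illustration.

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass(frozen=True)
class Cmp:
    """Leaf comparison such as age > 18 (hypothetical node type)."""
    field: str
    op: str
    value: Any


@dataclass(frozen=True)
class And:
    """Conjunction of child nodes (hypothetical node type)."""
    children: Tuple[Any, ...]


def and_(*nodes):
    """Build an And node, flattening nested And nodes into one level."""
    flat = []
    for node in nodes:
        if isinstance(node, And):
            flat.extend(node.children)
        else:
            flat.append(node)
    return And(tuple(flat))


class MongoVisitor:
    """Converts the tree into a MongoDB-style filter dict."""

    _OPS = {">": "$gt", "<": "$lt", "==": "$eq"}

    def visit(self, node):
        # Dispatch on the node's class name: visit_Cmp, visit_And, ...
        method = getattr(self, "visit_" + type(node).__name__)
        return method(node)

    def visit_Cmp(self, node):
        return {node.field: {self._OPS[node.op]: node.value}}

    def visit_And(self, node):
        return {"$and": [self.visit(child) for child in node.children]}


# A nested conjunction and its hand-flattened equivalent.
nested = and_(and_(Cmp("age", ">", 18), Cmp("age", "<", 65)), Cmp("name", "==", "Alice"))
flat = and_(Cmp("age", ">", 18), Cmp("age", "<", 65), Cmp("name", "==", "Alice"))

mongo_filter = MongoVisitor().visit(nested)
```

Because the nodes are frozen dataclasses, `nested == flat` holds and both can be used as dict keys or set members. Supporting another backend in this sketch means writing another visitor class, leaving the node types untouched.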

Quick Start

from therismos import F, optimize

age = F("age", int)
name = F("name")
status = F("status")

# Build expressions using natural Python syntax.
# Python's & binds tighter than |, so the parentheses below only make the grouping explicit.
expr = ((age > 18) & (name == "Alice")) | (status == "admin")

# Optimize the expression
optimized, records = optimize(expr)

# Detect contradictions automatically
contradiction = (age < 30) & (age > 40)
result, _ = optimize(contradiction)
# result is FALSE

# Aggregate OR equality chains
multi_status = (status == "active") | (status == "pending") | (status == "completed")
result, _ = optimize(multi_status)
# result is: status IN ("active", "pending", "completed")
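
The two optimizations shown above can be pictured with a small standalone sketch. This is not Therismos's implementation, only one plausible way to reason about the same cases: `is_contradiction` and `merge_equalities` are hypothetical helpers, and the tuple-based input format is an assumption made for brevity.

```python
import math


def is_contradiction(bounds):
    """Return True if strict bounds on a single field cannot all hold.

    bounds is a list of (op, value) pairs with op in {"<", ">"}.
    Values are treated as real numbers, so (> 18, < 19) is satisfiable.
    """
    lower = -math.inf
    upper = math.inf
    for op, value in bounds:
        if op == ">":
            lower = max(lower, value)  # keep the tightest lower bound
        elif op == "<":
            upper = min(upper, value)  # keep the tightest upper bound
        else:
            raise ValueError(f"unsupported operator: {op}")
    # With strict bounds, the open interval (lower, upper) is empty
    # exactly when lower >= upper.
    return lower >= upper


def merge_equalities(terms):
    """Collapse OR-ed (field, value) equality terms into field -> values.

    Order of first appearance is preserved and duplicates are dropped,
    mirroring a rewrite of `f == a | f == b` into `f IN (a, b)`.
    """
    merged = {}
    for field, value in terms:
        values = merged.setdefault(field, [])
        if value not in values:
            values.append(value)
    return merged


# (age < 30) & (age > 40) has no solution.
impossible = is_contradiction([("<", 30), (">", 40)])

# (age > 18) & (age < 30) is satisfiable.
possible = is_contradiction([(">", 18), ("<", 30)])

# Three equality tests on the same field collapse into one membership test.
in_sets = merge_equalities(
    [("status", "active"), ("status", "pending"), ("status", "completed")]
)
```

Here `impossible` is `True`, `possible` is `False`, and `in_sets` maps `"status"` to the three values in their original order.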