pandas Backend

Installation

pip install "therismos[pandas]"

Usage

import pandas as pd
from therismos import F
from therismos.sorting import SortSpec, SortCriterion, SortOrder
from therismos.grouping import GroupSpec, Aggregation, AggregationFunction
from therismos.expr.visitors.pandas import PandasExprVisitor
from therismos.sorting.visitors.pandas import PandasSortSpecVisitor
from therismos.grouping.visitors.pandas import PandasGroupSpecVisitor

df = pd.DataFrame({
    "age": [20, 15, 30],
    "status": ["active", "inactive", "active"],
    "price": [10.0, 20.0, 15.0],
    "category": ["A", "B", "A"],
})

# Filter: PandasExprVisitor returns a PandasFilter callable; calling it with a DataFrame yields the row mask
age = F("age")
status = F("status")
expr = (age > 18) & (status == "active")

mask = expr.accept(PandasExprVisitor())
df[mask(df)]

# Sort
spec = SortSpec([SortCriterion("age", SortOrder.DESCENDING)])
sort = spec.accept(PandasSortSpecVisitor())
df.sort_values(by=list(sort.by), ascending=list(sort.ascending))

# Group and aggregate
group_spec = GroupSpec(
    group_by=["category"],
    aggregations=[
        Aggregation("count", AggregationFunction.COUNT),
        Aggregation("avg_price", AggregationFunction.AVERAGE, "price"),
    ],
)
grp = group_spec.accept(PandasGroupSpecVisitor())
df.groupby(list(grp.group_by)).agg(**grp.agg)

PandasSortSpec is a frozen dataclass with by: tuple[str, ...] and ascending: tuple[bool, ...].
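The two tuples line up position by position, which matters once there is more than one sort key. The following is a minimal sketch of that shape using a hypothetical stand-in class, SortSpecSketch, which only mirrors the fields described above and is not the library's own PandasSortSpec. The sample data is invented for illustration.

```python
from dataclasses import dataclass, FrozenInstanceError

import pandas as pd


# Hypothetical stand-in mirroring the documented fields;
# this is not therismos's actual PandasSortSpec class.
@dataclass(frozen=True)
class SortSpecSketch:
    by: tuple[str, ...]
    ascending: tuple[bool, ...]


df = pd.DataFrame({
    "age": [20, 15, 30, 20],
    "price": [10.0, 20.0, 15.0, 5.0],
})

# Sort by age descending, then by price ascending to break ties.
sort = SortSpecSketch(by=("age", "price"), ascending=(False, True))

# Convert the tuples to lists, as in the usage example.
result = df.sort_values(by=list(sort.by), ascending=list(sort.ascending))
ages = result["age"].tolist()
prices = result["price"].tolist()

# A frozen dataclass rejects attribute assignment after construction.
try:
    sort.by = ("price",)
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Because the fields are tuples on a frozen dataclass, a spec of this shape can be shared or cached without the risk of it being changed in place.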

PandasGroupSpec is a frozen dataclass with group_by: tuple[str, ...] and agg: dict[str, pd.NamedAgg].
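To make the agg mapping concrete, here is a rough sketch of a hand-built dict[str, pd.NamedAgg] passed to pandas directly. The column and aggfunc values are assumptions chosen for illustration; the mapping that PandasGroupSpecVisitor actually produces may differ.

```python
import pandas as pd

df = pd.DataFrame({
    "price": [10.0, 20.0, 15.0],
    "category": ["A", "B", "A"],
})

# Output column name -> pd.NamedAgg(column, aggfunc).
# These particular column/aggfunc choices are illustrative assumptions,
# not necessarily what the therismos visitor emits.
agg = {
    "count": pd.NamedAgg(column="price", aggfunc="count"),
    "avg_price": pd.NamedAgg(column="price", aggfunc="mean"),
}

# Unpacking the dict turns each key into a named output column.
result = df.groupby(["category"]).agg(**agg)

counts = result["count"].tolist()
averages = result["avg_price"].tolist()
index = result.index.tolist()
```

Each key in the dict becomes a column in the result, and the group keys become the index.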

Note: COUNT requires at least one field in group_by. Using COUNT with an empty group_by raises ValueError.
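When no grouping is needed, one option is to compute whole-table aggregates with plain pandas instead of going through a group spec. The sketch below is a suggested workaround and not part of the therismos API.

```python
import pandas as pd

df = pd.DataFrame({
    "price": [10.0, 20.0, 15.0],
    "category": ["A", "B", "A"],
})

# Whole-table aggregates without any group keys.
total_rows = len(df)             # row count over the entire frame
avg_price = df["price"].mean()   # mean over the entire column
```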