pandas Backend
Installation
Usage
import pandas as pd
from therismos import F
from therismos.sorting import SortSpec, SortCriterion, SortOrder
from therismos.grouping import GroupSpec, Aggregation, AggregationFunction
from therismos.expr.visitors.pandas import PandasExprVisitor
from therismos.sorting.visitors.pandas import PandasSortSpecVisitor
from therismos.grouping.visitors.pandas import PandasGroupSpecVisitor
df = pd.DataFrame({
    "age": [20, 15, 30],
    "status": ["active", "inactive", "active"],
    "price": [10.0, 20.0, 15.0],
    "category": ["A", "B", "A"],
})
# Filter: PandasExprVisitor returns a PandasFilter callable
age = F("age")
status = F("status")
expr = (age > 18) & (status == "active")
mask = expr.accept(PandasExprVisitor())
df[mask(df)]
# Sort
spec = SortSpec([SortCriterion("age", SortOrder.DESCENDING)])
sort = spec.accept(PandasSortSpecVisitor())
df.sort_values(by=list(sort.by), ascending=list(sort.ascending))
# Group and aggregate
group_spec = GroupSpec(
    group_by=["category"],
    aggregations=[
        Aggregation("count", AggregationFunction.COUNT),
        Aggregation("avg_price", AggregationFunction.AVERAGE, "price"),
    ],
)
grp = group_spec.accept(PandasGroupSpecVisitor())
df.groupby(list(grp.group_by)).agg(**grp.agg)
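For comparison, the filter step can also be written directly in pandas. The sketch below assumes that the callable returned by PandasExprVisitor takes a DataFrame and returns a boolean Series, which is what the df[mask(df)] usage above suggests. hand_written_mask is a hypothetical stand-in and not part of therismos.

```python
import pandas as pd

df = pd.DataFrame({
    "age": [20, 15, 30],
    "status": ["active", "inactive", "active"],
})


def hand_written_mask(frame: pd.DataFrame) -> pd.Series:
    """Hypothetical stand-in for (age > 18) & (status == "active")."""
    # Each comparison yields a boolean Series; & combines them element-wise.
    return (frame["age"] > 18) & (frame["status"] == "active")


# Indexing with the boolean Series keeps only the rows where it is True.
filtered = df[hand_written_mask(df)]
print(filtered)
```

Building the mask from an expression tree rather than writing it by hand lets the same expression be reused wherever a matching visitor exists, while the final indexing step stays ordinary pandas.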
PandasSortSpec is a frozen dataclass with by: tuple[str, ...] and ascending: tuple[bool, ...].
PandasGroupSpec is a frozen dataclass with group_by: tuple[str, ...] and agg: dict[str, pd.NamedAgg].
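To make the shape of these two results concrete, here is a minimal sketch that uses hand-built stand-ins with the same fields. SortArgs and GroupArgs are hypothetical names introduced only for illustration; the real objects come from the visitors, and their internals are not shown in this page.

```python
from dataclasses import FrozenInstanceError, dataclass

import pandas as pd


@dataclass(frozen=True)
class SortArgs:
    """Hypothetical stand-in with the same fields as PandasSortSpec."""
    by: tuple[str, ...]
    ascending: tuple[bool, ...]


@dataclass(frozen=True)
class GroupArgs:
    """Hypothetical stand-in with the same fields as PandasGroupSpec."""
    group_by: tuple[str, ...]
    agg: dict[str, pd.NamedAgg]


df = pd.DataFrame({
    "age": [20, 15, 30],
    "price": [10.0, 20.0, 15.0],
    "category": ["A", "B", "A"],
})

sort = SortArgs(by=("age",), ascending=(False,))
# The tuples are converted to lists because pandas may interpret a tuple
# passed as `by` as a single column label.
sorted_df = df.sort_values(by=list(sort.by), ascending=list(sort.ascending))

grp = GroupArgs(
    group_by=("category",),
    agg={"avg_price": pd.NamedAgg(column="price", aggfunc="mean")},
)
# Unpacking the dict turns each key into an output column name.
grouped = df.groupby(list(grp.group_by)).agg(**grp.agg)

frozen = False
try:
    sort.by = ("price",)  # frozen dataclasses reject attribute assignment
except FrozenInstanceError:
    frozen = True
```

Because the dataclasses are frozen, a spec cannot be reassigned field by field after the visitor returns it. Note that the agg dict itself is still a mutable object.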
Note:
COUNT requires at least one field in group_by. Using COUNT with an empty group_by raises ValueError.
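A caller who builds specs dynamically may want to check this precondition up front. The helper below is a hypothetical sketch of the rule in plain Python, independent of therismos. The library's own check, and the wording of its error message, may differ.

```python
from collections.abc import Sequence


def ensure_count_is_grouped(group_by: Sequence[str], uses_count: bool) -> None:
    """Hypothetical pre-check mirroring the documented COUNT rule."""
    if uses_count and len(group_by) == 0:
        raise ValueError("COUNT requires at least one field in group_by")


# At least one grouping field is present, so this passes silently.
ensure_count_is_grouped(["category"], uses_count=True)

message = None
try:
    ensure_count_is_grouped([], uses_count=True)
except ValueError as exc:
    message = str(exc)
```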