{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Columns\n", "\n", "[[Polars Documentation](https://docs.pola.rs/api/python/stable/reference/expressions/columns.html)]\n", "\n", "`pl.col(\"A\")` or `pl.col.A` starts an expression by selecting a column named \"A\".\n", "While the Polars library allows multiple columns to be selected simultaneously\n", "(via `pl.col(\"*\")`, `pl.col(\"A\", \"B\")`, `pl.col(pl.String)`, `pl.exclude`, and so on),\n", "the OpenDP Library currently supports selecting only one column at a time.\n", "The column name may be changed via `.alias`.\n", "\n", "Take, for example, the work hours dataset, which contains a collection of columns labeled `METHODX`,\n", "where `X` is a letter in an increasing alphabetic sequence." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import polars as pl\n", "import opendp.prelude as dp\n", "\n", "dp.enable_features(\"contrib\")\n", "\n", "# Not recommended: OpenDP will reject this joint expression over multiple columns\n", "single_expr = pl.col([f\"METHOD{letter}\" for letter in \"ABCDE\"]).fill_null(0).dp.sum((0, 9))\n", "\n", "# Instead, build an individual expression for each column\n", "split_exprs = [pl.col(f\"METHOD{letter}\").fill_null(0).dp.sum((0, 9)) for letter in \"ABCDE\"]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Demonstration of use:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
METHODA | METHODB | METHODC | METHODD | METHODE |
---|---|---|---|---|
i64 | i64 | i64 | i64 | i64 |
1704484 | 1699390 | 1702886 | 1703232 | 1705356 |