Sunday, 15 July 2012

python - Produce a summary ("pivot"?) table




I'd like a way to summarise database table rows that share a common ID, so that they are summarised into one row of output.

My tools are SQLite and Python 2.x.

For example, given the following table of fruit prices at local supermarkets...

+--------------------+--------------------+--------------------+
|fruit               |shop                |price               |
+--------------------+--------------------+--------------------+
|apple               |coles               |$1.50               |
|apple               |woolworths          |$1.60               |
|apple               |iga                 |$1.70               |
|banana              |coles               |$0.50               |
|banana              |woolworths          |$0.60               |
|banana              |iga                 |$0.70               |
|cherry              |coles               |$5.00               |
|date                |coles               |$2.00               |
|date                |woolworths          |$2.10               |
|elderberry          |iga                 |$10.00              |
+--------------------+--------------------+--------------------+

... I want to produce a summary table showing me the cost of each fruit at each supermarket. Blank spaces should be filled with NULLs.

+----------+----------+----------+----------+
|fruit     |coles     |woolworths|iga       |
+----------+----------+----------+----------+
|apple     |$1.50     |$1.60     |$1.70     |
|banana    |$0.50     |$0.60     |$0.70     |
|cherry    |$5.00     |null      |null      |
|date      |$2.00     |$2.10     |null      |
|elderberry|null      |null      |$10.00    |
+----------+----------+----------+----------+

I believe the literature calls this a "pivot table" or a "pivot query", but apparently SQLite doesn't support pivot. (The solution in the related question uses hardcoded left joins. That doesn't appeal to me because I don't know the "column" names in advance.)

Right now I'm doing this by iterating through the entire table in Python and accumulating a dict of dicts, which is a bit klutzy. I'm open to better solutions, either in Python or SQLite, that will give me the info in tabular form.
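For reference, the dict-of-dicts accumulation described above might look roughly like the sketch below. The table name `prices`, its column names, and the in-memory database are assumptions made for illustration; only a few of the sample rows are loaded.

```python
import sqlite3
from collections import defaultdict

# Hypothetical schema: prices(fruit, shop, price) in an in-memory database.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE prices (fruit TEXT, shop TEXT, price REAL)')
conn.executemany('INSERT INTO prices VALUES (?, ?, ?)', [
    ('apple', 'coles', 1.50), ('apple', 'woolworths', 1.60),
    ('cherry', 'coles', 5.00), ('elderberry', 'iga', 10.00),
])

# Accumulate {fruit: {shop: price}} while walking every row once.
table = defaultdict(dict)
shops = set()
for fruit, shop, price in conn.execute('SELECT fruit, shop, price FROM prices'):
    table[fruit][shop] = price
    shops.add(shop)

# One output row per fruit; dict.get() returns None for a missing shop.
columns = sorted(shops)
summary = [[fruit] + [table[fruit].get(shop) for shop in columns]
           for fruit in sorted(table)]
```

It works, but the whole table has to pass through Python and the column list has to be collected by hand along the way.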

On the Python side, you could use some itertools magic for rearranging your data:

    from itertools import groupby, islice
    from operator import itemgetter
    from collections import defaultdict

    data = [('apple', 'coles', 1.50),
            ('apple', 'woolworths', 1.60),
            ('apple', 'iga', 1.70),
            ('banana', 'coles', 0.50),
            ('banana', 'woolworths', 0.60),
            ('banana', 'iga', 0.70),
            ('cherry', 'coles', 5.00),
            ('date', 'coles', 2.00),
            ('date', 'woolworths', 2.10),
            ('elderberry', 'iga', 10.00)]

    # Column headings: every distinct shop name, in sorted order.
    stores = sorted(set(row[1] for row in data))

    # Group the sorted rows by fruit. Each group becomes a {store: price}
    # mapping that returns None for a store with no price.
    pivot = ((fruit, defaultdict(lambda: None,
                                 (islice(row, 1, None) for row in rows)))
             for fruit, rows in groupby(sorted(data), itemgetter(0)))

    print 'Fruit'.ljust(12), '\t'.join(stores)
    for fruit, prices in pivot:
        print fruit.ljust(12), '\t'.join(str(prices[s]) for s in stores)

Output:

    Fruit        coles      iga     woolworths
    apple        1.5        1.7     1.6
    banana       0.5        0.7     0.6
    cherry       5.0        None    None
    date         2.0        None    2.1
    elderberry   None       10.0    None
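On the SQLite side, one possible alternative is to build the pivot query at run time, so the shop names never have to be hardcoded. The sketch below is only an illustration under assumptions: the table name `prices`, its column names, and the in-memory database are made up here. It reads the distinct shops first, then generates one `MAX(CASE ...)` column per shop.

```python
import sqlite3

# Hypothetical schema: prices(fruit, shop, price) in an in-memory database.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE prices (fruit TEXT, shop TEXT, price REAL)')
conn.executemany('INSERT INTO prices VALUES (?, ?, ?)', [
    ('apple', 'coles', 1.50), ('apple', 'woolworths', 1.60),
    ('apple', 'iga', 1.70), ('banana', 'coles', 0.50),
    ('banana', 'woolworths', 0.60), ('banana', 'iga', 0.70),
    ('cherry', 'coles', 5.00), ('date', 'coles', 2.00),
    ('date', 'woolworths', 2.10), ('elderberry', 'iga', 10.00),
])

# Discover the "column" names from the data itself.
shops = [row[0] for row in
         conn.execute('SELECT DISTINCT shop FROM prices ORDER BY shop')]

# One aggregate column per shop. Shop names are bound as parameters rather
# than pasted into the SQL text. CASE without ELSE yields NULL, and MAX over
# only NULLs is NULL, so missing prices come back as None.
columns = ', '.join('MAX(CASE WHEN shop = ? THEN price END)' for _ in shops)
query = 'SELECT fruit, %s FROM prices GROUP BY fruit ORDER BY fruit' % columns
rows = conn.execute(query, shops).fetchall()
```

Each entry in `rows` is a tuple of the fruit followed by one price per shop, in the same order as `shops`. The query assumes at least one shop exists; with an empty table the generated SQL would be invalid.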

python sqlite
