This package provides utility functions for manipulating large histograms: trimming, subsetting, merging buckets, merging histograms, converting to a CDF, and calculating the information loss due to binning. It also provides a protocol buffer representation of R's native histogram class, so that histograms over large data sets can be computed and combined in distributed analytical pipelines.
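As a rough illustration of these operations, the sketch below builds two histograms over the same breaks (as if each came from a separate shard of data) and runs them through the merge, subset, trim, CDF, and information-loss steps. It assumes the package is installed and that the function names and arguments shown (`AddHistograms`, `MergeBuckets`, `SubsetHistogram`, `TrimHistogram`, `HistToEcdf`, `KSDCC`, `EMDCC`) match your installed version; the data, breaks, and variable names are made up for the example, so check the package help pages before relying on it.

```r
# Sketch only: assumes HistogramTools is installed; data and names are illustrative.
library(HistogramTools)

set.seed(42)
breaks <- seq(0, 10, by = 1)

# Two histograms over identical breaks, as if computed on separate data shards.
h1 <- hist(runif(100, 0, 10), breaks = breaks, plot = FALSE)
h2 <- hist(runif(100, 0, 10), breaks = breaks, plot = FALSE)

# Merge histograms: counts are summed bucket by bucket.
h.all <- AddHistograms(h1, h2)

# Merge buckets: combine each pair of adjacent buckets into one wider bucket.
h.coarse <- MergeBuckets(h.all, adj.buckets = 2)

# Subset: keep only the buckets between two existing breakpoints.
h.sub <- SubsetHistogram(h.all, minbreak = 2, maxbreak = 8)

# Trim: drop empty buckets from both ends (a no-op if the end buckets are non-empty).
h.trim <- TrimHistogram(h.sub)

# Convert to a CDF: an approximate empirical CDF built from the binned counts.
cdf <- HistToEcdf(h.all)

# Information loss due to binning, as two distance measures between the
# smallest and largest CDFs consistent with the histogram.
ksdcc <- KSDCC(h.all)
emdcc <- EMDCC(h.all)
```

Because every step takes and returns a regular R `histogram` object (apart from the CDF and the two loss measures), the results can be passed straight to `plot()` or chained into further calls.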
You can install from source via this repo, or install the CRAN package from R in the usual way.
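For the CRAN route, a minimal sketch looks like this, assuming the package is currently available from your configured CRAN mirror under the name `HistogramTools`:

```r
# Install the released package from CRAN (assumes a configured mirror and network access).
install.packages("HistogramTools")

# Load it into the current session.
library(HistogramTools)
```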
"RProtoBuf & HistogramTools: Statistical Analysis Tools for Large Data Sets", Google Open Source Blog, October 10, 2013.
This package was originally developed at Google between 2011 and 2015. It is now independently maintained by the original author.
Murray Stokely
Apache 2.0