I created a time series database for private use and am also looking for ways to reduce its on-disk and in-memory footprint and simplify things.
Typically a time series database stores values in an inflexible 64-bit number such as a long integer or a double-precision float. I did that at first just to get something done, but then branched out into separating public types from internal types. The idea is to expose types like percent, normalized value, integer, rational, etc., and map those onto internal storage types that can take advantage of space savings.
The advantage of this separation is that with public types I can turn down the resolution, effectively establishing min/max values, and realize those space savings. Trying to come up with a general data container/algorithm is a terribly inefficient idea when users can declare their value type instead.
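As a rough sketch of what that public-to-internal mapping might look like (all of the names here are hypothetical, not my actual implementation), a declared public type with its resolution turned down can be routed to a narrower storage width:

```rust
// Hypothetical sketch: map user-facing value types to compact internal storage.
#[derive(Debug, PartialEq)]
enum PublicType {
    Percent { decimals: u8 },    // 0..=100 with a few decimal places
    Normalized { decimals: u8 }, // 0.0..=1.0
    OnOff,                       // a single bit
    Integer { bits: u8 },        // caller-declared width
}

#[derive(Debug, PartialEq)]
enum Storage {
    Bit,
    U16,
    U32,
    U64,
}

fn storage_for(t: &PublicType) -> Storage {
    match t {
        PublicType::OnOff => Storage::Bit,
        // 100 * 10^2 = 10,000 fits in 16 bits; more decimal places need 32.
        PublicType::Percent { decimals } if *decimals <= 2 => Storage::U16,
        PublicType::Percent { .. } => Storage::U32,
        // 1.0 * 10^4 = 10,000 also fits in 16 bits.
        PublicType::Normalized { decimals } if *decimals <= 4 => Storage::U16,
        PublicType::Normalized { .. } => Storage::U32,
        PublicType::Integer { bits } if *bits <= 16 => Storage::U16,
        PublicType::Integer { bits } if *bits <= 32 => Storage::U32,
        PublicType::Integer { .. } => Storage::U64,
    }
}
```

The point is that the resolution decision happens once, at declaration time, instead of being re-derived per sample by some general-purpose container.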
Recording CPU percentage does not require a 64-bit integer; that's crazy. 0-100 and a few decimal places is all that matters. The purpose of recording a percentage is really to see measurable changes, integer changes rather than decimal changes in value, but we are hoomans and like to see values go up and down, so a few decimal places is fine for a percentage value.
A normalized value goes from 0.0 to 1.0. More decimal places are allowed for it, but at some point the extra places become meaningless, so a 64-bit number isn't needed either. If you need scientific calculations, a tsdb isn't where you are going to need crazy precision.
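The percentage case above boils down to simple fixed-point quantization. A minimal sketch, assuming two decimal places are kept (function names are mine, for illustration): a 0-100 percentage scaled by 100 fits comfortably in a u16, a 4x saving over an f64.

```rust
// Hypothetical sketch: store a percentage (0.0..=100.0) as a u16
// with two decimal places of resolution.
const SCALE: f64 = 100.0; // two decimal places

fn encode_percent(p: f64) -> u16 {
    // Clamp to the declared min/max, then round to the nearest hundredth.
    (p.clamp(0.0, 100.0) * SCALE).round() as u16
}

fn decode_percent(raw: u16) -> f64 {
    raw as f64 / SCALE
}
```

The same scheme works for a normalized 0.0-1.0 value; only the scale factor and clamp bounds change.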
I plan to support on/off state storage, essentially storing a series of 0s and 1s efficiently. The best case is a constant run of 0s or 1s, as that requires the least storage space. I imagine storing up and down states based on an uptime check.
I always wanted an on/off state series but never bothered to build it anywhere I worked because it was inefficient. Imagine needing only 1 bit and finding out that 1 bit expands to 64 bits (a 64x space increase) in the db. That doesn't make sense, especially when database engineers, at the time, complained about teams wasting space and looked to calculate costs for every team.
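One way the "constant series costs almost nothing" property can fall out naturally is run-length encoding, sketched below (this is an illustration of the general technique, not my actual storage format). A long stretch of identical states collapses to a single (state, count) pair:

```rust
// Hypothetical sketch: run-length encode a series of on/off samples.
// An hour of "up" at one sample per second becomes one (true, 3600) pair
// instead of 3600 stored values.
fn rle_encode(bits: &[bool]) -> Vec<(bool, u32)> {
    let mut runs: Vec<(bool, u32)> = Vec::new();
    for &b in bits {
        match runs.last_mut() {
            // Same state as the current run: just bump the counter.
            Some((state, count)) if *state == b => *count += 1,
            // State flipped (or first sample): start a new run.
            _ => runs.push((b, 1)),
        }
    }
    runs
}
```

The worst case (state flipping every sample) degrades to one pair per sample, which matches the pattern-based expectation: stable data is nearly free, noisy data pays full price.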
I want to come up with other types that don't require much space, such as a Short/Half Decimal. CPU load and temperature are decimal values but don't require a crazy amount of space like 64 bits. I mean honestly, does one need more than a ±1,024 integer with a few decimal places? A 16-bit number (4x savings) is good enough for these kinds of things.
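A "Half Decimal" like this could be sketched as an i16 with a few fractional bits; assuming 5 fractional bits (my choice for illustration, not a fixed design), the range is roughly -1024.0 to +1023.97 with a resolution of 1/32 (~0.03), plenty for CPU load or temperature at a quarter of the space of an f64:

```rust
// Hypothetical sketch of a 16-bit "Half Decimal": an i16 fixed-point
// value with 5 fractional bits (range ~±1024, resolution 1/32).
const FRAC_BITS: u32 = 5;

fn encode_half(v: f64) -> i16 {
    // Scale into fixed-point, round, and clamp to the i16 range.
    let scaled = (v * (1 << FRAC_BITS) as f64).round();
    scaled.clamp(i16::MIN as f64, i16::MAX as f64) as i16
}

fn decode_half(raw: i16) -> f64 {
    raw as f64 / (1 << FRAC_BITS) as f64
}
```

Trading fractional bits for range (or vice versa) is a one-constant change, which fits the declare-your-own-resolution idea above.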
I don't plan to add the ability to run functions; I will instead leave all of that to a UI. A UI can cache values and do the math, so why do it on the server side? I get it for mobile devices, but those devices are getting so much better, and WASM is efficient and portable. If anything, the values can be pushed into something like DuckDB/Apache Arrow and functions can be run there. I no longer see the point of implementing all of that on the server side. That obviously means this won't use Grafana to visualize, which I am totally fine with. A new visualization UI is needed for what I want, and it is Rust based.
My ideal time series database has always been, for the past eight or so years, the kind where I don't have to give as much of a sh*t about the space of the data type I record, only about the patterns. If the data is (near) constant most of the time, there should be a space benefit; if the data changes, but not by much, it should not take much space; but if the data changes wildly, there should not be much expectation of space savings.
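That pattern-sensitive behavior is exactly what delta-plus-varint encoding gives you; as a sketch of the general technique (not my actual format), each sample is stored as the difference from the previous one, written as a variable-length integer with ZigZag mapping so small negative deltas stay small:

```rust
// Hypothetical sketch: delta-encode samples, then write each delta as a
// LEB128-style variable-length integer. Constant data yields runs of
// zero deltas (1 byte each); small changes stay small; wild swings pay
// the full multi-byte price.

// ZigZag: map signed deltas to unsigned so -1, 1, -2, 2... -> 1, 2, 3, 4...
fn zigzag(v: i64) -> u64 {
    ((v << 1) ^ (v >> 63)) as u64
}

// Write 7 bits per byte, high bit set on all but the last byte.
fn varint(mut v: u64, out: &mut Vec<u8>) {
    loop {
        let byte = (v & 0x7f) as u8;
        v >>= 7;
        if v == 0 {
            out.push(byte);
            break;
        }
        out.push(byte | 0x80);
    }
}

fn encode_deltas(samples: &[i64]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut prev = 0i64;
    for &s in samples {
        varint(zigzag(s - prev), &mut out);
        prev = s;
    }
    out
}
```

A constant series costs one byte per sample before any further compression (and those zero bytes compress to almost nothing), while a wildly varying series degrades gracefully toward full-width storage, which is the behavior described above.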