I worked on datacenter monitoring around 2015, and I automated the heck out of my job. One of the last projects I delivered there was a distributed real-time monitoring and alerting platform, written in Java. I'm going to build it again, this time in Rust and with the help of AI.
It was an incredibly complex system with 5 different roles, technically 4 layers. I spent around 6 months building it and was incredibly proud of it. It worked like a charm: very fast, high performance, and it exceeded my wildest dreams when I demoed it to the team that was going to use it. I don't remember practicing the part where a team member took down an unused link on a datacenter device so that the system would generate the email or page, but shit, it worked. That is innovation.
The system allowed users to define rules, and it was crazy to me how fast it could go with UDFs (user-defined functions).
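To make the rules-plus-UDFs idea concrete, here is a minimal sketch of how it could look in Rust. This is not the original Java design. `DataPoint`, `Rule`, `RuleEngine`, and the field names are all hypothetical, and the UDF is assumed to be a plain closure that returns `true` when the rule should fire.

```rust
use std::collections::HashMap;

/// One sample from a device. All field names here are hypothetical.
#[derive(Debug, Clone)]
struct DataPoint {
    device: String,
    metric: String,
    value: f64,
}

/// A user-defined function: returns true when the rule should fire.
type Udf = Box<dyn Fn(&DataPoint) -> bool + Send + Sync>;

struct Rule {
    name: String,
    udf: Udf,
}

/// Rules are indexed by metric name, so each incoming point is only
/// checked against the rules registered for that metric.
struct RuleEngine {
    rules: HashMap<String, Vec<Rule>>,
}

impl RuleEngine {
    fn new() -> Self {
        Self { rules: HashMap::new() }
    }

    fn register(&mut self, metric: &str, name: &str, udf: Udf) {
        self.rules
            .entry(metric.to_string())
            .or_default()
            .push(Rule { name: name.to_string(), udf });
    }

    /// Returns "device: rule" for every rule that fired on this point.
    fn evaluate(&self, point: &DataPoint) -> Vec<String> {
        match self.rules.get(&point.metric) {
            Some(rules) => rules
                .iter()
                .filter(|rule| (rule.udf)(point))
                .map(|rule| format!("{}: {}", point.device, rule.name))
                .collect(),
            None => Vec::new(),
        }
    }
}
```

Registering `Box::new(|p: &DataPoint| p.value == 0.0)` under a `link_status` metric would fire for the unused-link demo above. Indexing by metric is one plausible reason this kind of engine stays fast: most points never touch most rules.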
It processed billions of data points per day coming from datacenter devices, collected by another piece of automated software that, obviously, I also built 😏. The interesting part was that I used WebSockets instead of raw TCP for all communication: between layers, within a layer, and across datacenters. I think I prioritized cluster communication, since those messages are high priority. I only wanted to use one port for both HTTP and streaming. I know HTTP/2 and HTTP/3 are a thing now, but at the time this made the most sense to me. I'd still use WebSockets today.
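Since everything shared one connection, prioritizing cluster traffic comes down to deciding which frame goes out next. Here is a rough sketch of one way to do that, assuming two traffic classes and strict priority. `Frame` and `OutboundQueue` are hypothetical names, and I'm not claiming this is how the original did it.

```rust
use std::collections::VecDeque;

/// Two traffic classes sharing one connection. Names are hypothetical.
#[derive(Debug, Clone, PartialEq)]
enum Frame {
    Cluster(String), // heartbeats, membership, coordination
    Data(String),    // metric payloads
}

/// Strict-priority outbound queue: cluster frames always drain first,
/// and each class stays FIFO within itself.
#[derive(Default)]
struct OutboundQueue {
    cluster: VecDeque<Frame>,
    data: VecDeque<Frame>,
}

impl OutboundQueue {
    fn push(&mut self, frame: Frame) {
        match frame {
            Frame::Cluster(_) => self.cluster.push_back(frame),
            Frame::Data(_) => self.data.push_back(frame),
        }
    }

    /// Next frame to write to the socket, or None if both queues are empty.
    fn pop(&mut self) -> Option<Frame> {
        self.cluster.pop_front().or_else(|| self.data.pop_front())
    }
}
```

Strict priority is the simplest option, but it can starve data frames if cluster traffic never lets up. A weighted scheme would be the obvious next step if that turned out to matter.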
I had AI look at multiple strategies for structuring the system and, surprisingly, it said what I did was balanced enough (table below). It couldn't think of any other improvement for such a system. I really did my job well there: high throughput and high availability with cross-datacenter communication. It was a very insular system; it had to keep operating even if its dependencies were down. It required low maintenance. You definitely don't see many systems giving you that, or many people in the world able to deliver that on their own. I looked at research papers and conference talks, soaking up as much knowledge as possible.
| Structure | Throughput Potential | HA Potential | Complexity | Insular Processing Adherence | "Brain" Role Clarity | Notes |
|---|---|---|---|---|---|---|
| Baseline | High | High (requires HA mgmt) | High | Good | Good | Strong contender, balances concerns well. Mgmt HA is key. |
For the Rust version, I will use the Excerion Sun Messaging Backbone, or XSMB. It already includes RPC, Pub/Sub, and clustering with P2P discovery, all provided via ZeroMQ. I've talked about the performance of this backbone before: extremely high performance for internal communications. I'll open source and publish the backbone as soon as I hit 1.0.0 with it. I'm confident in how stable and easy to use it is today, but I think the clustering needs more work.
With the knowledge I have and the incredible amount of power AI has, I expect to deliver a much, MUCH more efficient version.