A common use case: Let's assume you have an accounts table and you want to know when an account was created or last edited. or you want a log table that stores created time.
First thing I would think of is what fields and what data type do I need? created_on BIGINT and updated_at BIGINT.
Well, why not use TIMESTAMP instead?
- The year 2038 problem! Only 20 years t o go. A MySQL task is create d to track expanding the field. I'm sure it will get solved via database update, but until then... yeah.
- I store milliseconds in database by default and TIMESTAMP doesn't support that as far as I know, call it cargo culting!
- I like comparing numbers rather than relying on a timestamp data type.
There is nothing wrong with using TIMESTAMP and it comes down to personal preference. It makes absolute sense to use TIMESTAMP as you can take advantage of MySQL's automatic initalization feature, which means the database manages these two fields and it is a happy day for a developer.
Back to BIGINT
So we have two fields that are bigint. Now I need to make the decision on whether this will be application or databased managed a.k.a. who is responsible for initializing and updating these fields? On one hand, if the application does it, then it is one single statement and no magic, but need to write code or use an ORM (i.e. Sequelize) to set these fields. One the other hand, if the database does it, then it is a TRIGGER statement and it is magic, but no application code needs to modify these fields.
Application Managed Timestamps
Usually, I would use application code to set these fields because I guess I like having full control over things, but then I remember if I am using many servers how likely are the servers times to drift from each other especially if one gets overloaded? It would be weird to see that two accounts did something sequentially but see that the timestamp rows differ drastically. Does the application really need to modify this data? It can read it if needed, but is there ever a need to modify it? Maybe in high update/insert scenarios, which may be 0.001% of cases, but will you be using MySQL then?
I think a distributed application would be hard to build with applications controlling this metadata. So you eschew application managed timestamps, what is left? TRIGGERS!
In comes Database Triggers
Triggers are lauded for slowing a database down (i.e here). It is understandable, but those performance concerns are from the old days (5.1) and MySQL is at 5.7 stable going up to 8.0 stable soon. I do not know how triggers were executed in the past, but the manual for MySQL and MariaDB show they execute within the same transaction.
Think about what triggers buy us for our use case: It is magic, follows a singular time from the system, we can extend the trigger to create logs for changes without changing application code. The downside is that if you do not use row-based replication then triggers need to be executed everywhere, which would be odd for using real time functions such as NOW().
Assume you use row-based replication, then yes, we follow a singular leader's time for these two fields. Let's look at the triggers for creating millisecond fields. Remember, our table is called accounts.
Creation Trigger
CREATE TRIGGER accounts__before_insert
BEFORE INSERT ON acounts
FOR EACH ROW SET NEW.created_on = ROUND(UNIX_TIMESTAMP(CURTIME(4)) * 1000), NEW.updated_at = NEW.created_on;
Update Trigger
CREATE TRIGGER accounts__before_update
BEFORE UPDATE ON accounts
FOR EACH ROW SET NEW.updated_at = ROUND(UNIX_TIMESTAMP(CURTIME(4)) * 1000);
I tried a couple of insert and updates and bam, things work as they should!
Conclusion
What have I learned? Save time and use triggers. It is unneccessary to push to application without profiling.
I thought about the high write scenario and think updating a cache would be better then flushing it every once in a while to MySQL would be a better option.