Decoupling Models From Mongo

Trevor.Conn@...
 

Hi all -- As part of the effort to separate EdgeX's internal and external model representations, we've made good progress in restructuring the project packages to clearly delineate which code is publicly exportable and which code is usable only by EdgeX internals. Currently, all application logic -- both internal and external -- points to the externally available models located in pkg/models. The long-term goal is to change that so that EdgeX has an internal representation of its model which can be changed independently of the contract model.


One of the challenges I've been wrestling with is the coupling we have to Mongo. Our models currently specify BSON serialization metadata (something no client should ever care about) as well as BSON.ObjectID properties and our application logic manages the creation and assignment of BSON.ObjectIDs rather than the underlying database platform. In the future state, if we'd like for the database to be swappable, we have to eliminate explicit knowledge of how to manage database keys and push any metadata about translating a type to a database representation down into the database layer.
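
A minimal sketch of the separation described above, assuming a made-up Event type: the contract model carries only JSON metadata, while the BSON tags and the native key live on a type private to the database layer. All type, field, and function names here are hypothetical and are not taken from the EdgeX code base. Id is a plain string to keep the sketch free of driver imports; with a real driver it would be the driver's own key type.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Event is a hypothetical contract (external) model. It has no BSON
// metadata and no native database key, only a user-visible UUID.
type Event struct {
	UUID   string `json:"uuid"`
	Device string `json:"device"`
}

// mongoEvent is a hypothetical database-layer model. The bson tags are
// ordinary struct tags, so nothing here depends on a driver package.
type mongoEvent struct {
	Id     string `bson:"_id,omitempty"` // native key, assigned by the database
	Uuid   string `bson:"uuid"`
	Device string `bson:"device"`
}

// toContract drops the native key so it never leaves the database layer.
func toContract(m mongoEvent) Event {
	return Event{UUID: m.Uuid, Device: m.Device}
}

// fromContract builds the storage form and leaves Id empty for the
// database to fill in.
func fromContract(e Event) mongoEvent {
	return mongoEvent{Uuid: e.UUID, Device: e.Device}
}

func main() {
	stored := mongoEvent{Id: "5b51e1c2", Uuid: "a1b2", Device: "thermostat-1"}
	out, err := json.Marshal(toContract(stored))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // the native key does not appear in the output
}
```

With this split, swapping the database would mean writing a new storage type and a new pair of mapping functions, while the contract model stays untouched.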


Fede has made a good start down a path that will facilitate cleaner separation with his types in the db/mongo package, but there's more work that needs to be done. I want to summarize some points that I have in mind based on a day and a half of wrestling with how to achieve a clean separation between our business logic and the requirements of the underlying storage platform. These should not be taken as decisions, but visibility into what I'm currently thinking and an open request for comment.


  • Push all key and referential integrity management down to the database
    • This means no object should expose the internal primary key/_id value to the application above the database layer (like db/mongo above)
    • This means we standardize on a user-specified UUID which identifies entries and is the primary means by which we perform lookups from the application layer
    • This means the given UUID field/column needs to be explicitly indexed
      • Mongo example: db.event.createIndex({uuid: 1}, {name: "ix_event_uuid"})
    • This eliminates the need to force myriad, native database key datatypes into a generic representation (like string) for later verification and casting.
    • No application logic should ever be responsible for assigning these values
  • For models at the database layer (like db/mongo above), suggest we use "Id" for the property containing the actual database key value
  • For models at all other layers (which will NOT have the above property) suggest we use "Key" as a string containing the user-defined UUID
  • In cases where it is necessary for the database layer to obtain IDs of newly inserted records, we must require that the "driver" used by EdgeX is capable of providing this functionality.
    • Example use case: see the AddEvent function
      • Our current mongo driver (gopkg.in/mgo.v2/bson) is incapable of returning the IDs of the newly inserted records for the readings, and so we must create those IDs in the application logic so that the DBRefs are aligned correctly when the event is stored.
      • If, however, we used the official Mongo go provider we would be able to obtain the IDs of newly inserted records due to support for Mongo's insertOne and insertMany capabilities.
    • NOTE: Our current mgo.v2 driver is no longer maintained, so we need to switch anyway!!!
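
The points above could be sketched roughly as follows. This is an illustration under stated assumptions, not the EdgeX implementation: DBClient, memoryClient, and every field name are hypothetical, and an in-memory map stands in for a real database so the example runs without a driver. The helper insertReadings imitates an insertMany-style call that hands back the native keys it assigned, which is the capability the official driver would need to supply.

```go
package main

import (
	"errors"
	"fmt"
)

// Reading and Event are hypothetical application-layer models. They carry
// the user-defined UUID in Key and never see a native database key.
type Reading struct {
	Key   string
	Value string
}

type Event struct {
	Key      string
	Readings []Reading
}

// DBClient is a hypothetical storage contract seen by the application.
type DBClient interface {
	AddEvent(e Event) error
	EventByKey(key string) (Event, error)
}

// eventRecord is the stored form; it references readings by native key.
type eventRecord struct {
	Key        string
	ReadingIDs []int
}

// memoryClient stands in for a real database package.
type memoryClient struct {
	nextID   int // mimics the database assigning native keys
	readings map[int]Reading
	events   map[int]eventRecord
	byKey    map[string]int // plays the role of the index on the UUID field
}

func newMemoryClient() *memoryClient {
	return &memoryClient{
		readings: map[int]Reading{},
		events:   map[int]eventRecord{},
		byKey:    map[string]int{},
	}
}

// insertReadings stores the readings and returns the native keys assigned.
func (c *memoryClient) insertReadings(rs []Reading) []int {
	ids := make([]int, 0, len(rs))
	for _, r := range rs {
		c.nextID++
		c.readings[c.nextID] = r
		ids = append(ids, c.nextID)
	}
	return ids
}

// AddEvent wires the event to its readings using keys the store produced.
func (c *memoryClient) AddEvent(e Event) error {
	if e.Key == "" {
		return errors.New("event key (UUID) is required")
	}
	if _, dup := c.byKey[e.Key]; dup {
		return fmt.Errorf("event %q already exists", e.Key)
	}
	ids := c.insertReadings(e.Readings)
	c.nextID++
	c.events[c.nextID] = eventRecord{Key: e.Key, ReadingIDs: ids}
	c.byKey[e.Key] = c.nextID
	return nil
}

// EventByKey looks an event up by UUID and resolves its readings.
func (c *memoryClient) EventByKey(key string) (Event, error) {
	id, ok := c.byKey[key]
	if !ok {
		return Event{}, fmt.Errorf("event %q not found", key)
	}
	rec := c.events[id]
	e := Event{Key: rec.Key}
	for _, rid := range rec.ReadingIDs {
		e.Readings = append(e.Readings, c.readings[rid])
	}
	return e, nil
}

func main() {
	var db DBClient = newMemoryClient()
	err := db.AddEvent(Event{Key: "evt-1", Readings: []Reading{{Key: "r-1", Value: "21.5"}}})
	if err != nil {
		panic(err)
	}
	got, err := db.EventByKey("evt-1")
	if err != nil {
		panic(err)
	}
	fmt.Println(got.Key, len(got.Readings))
}
```

The application only ever passes and receives Key values; the integer keys and the event-to-reading references stay inside memoryClient, which is where a Mongo-specific package would keep its ObjectIDs and DBRefs.
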
Thanks for your attention in reading this. I'm not going to proceed any further down this path until I can get some feedback and clarity on the above points. As I said at the beginning, there are real benefits to the work done so far and once the export services have been re-organized to fit in the new package structure, we'll be in much better shape. But there's more work to be done and the primary intent of the original issue I created hasn't been addressed yet.

To ensure we follow through on making sure EdgeX is flexible enough to operate in a customer-defined environment, we have to decouple vendor-specific logic and types from the main application.


Trevor Conn
Senior Principal Software Engineer
Dell Technologies | IoT DellTech
Trevor_Conn@...
Round Rock, TX  USA

espy
 

On 7/20/18 12:22 PM, Trevor.Conn@... wrote:

Hi all -- As part of the effort to separate EdgeX's internal and external model representations, we've made good progress in restructuring the project packages to clearly delineate which code is publicly exportable and which code is usable only by EdgeX internals. Currently, all application logic -- both internal and external -- points to the externally available models located in pkg/models. The long-term goal is to change that so that EdgeX has an internal representation of its model which can be changed independently of the contract model.


One of the challenges I've been wrestling with is the coupling we have to Mongo. Our models currently specify BSON serialization metadata (something no client should ever care about) as well as BSON.ObjectID properties and our application logic manages the creation and assignment of BSON.ObjectIDs rather than the underlying database platform. In the future state, if we'd like for the database to be swappable, we have to eliminate explicit knowledge of how to manage database keys and push any metadata about translating a type to a database representation down into the database layer.


Fede has made a good start down a path that will facilitate cleaner separation with his types in the db/mongo package, but there's more work that needs to be done. I want to summarize some points that I have in mind based on a day and a half of wrestling with how to achieve a clean separation between our business logic and the requirements of the underlying storage platform. These should not be taken as decisions, but visibility into what I'm currently thinking and an open request for comment.


  • Push all key and referential integrity management down to the database
    • This means no object should expose the internal primary key/_id value to the application above the database layer (like db/mongo above)
    • This means we standardize on a user-specified UUID which identifies entries and is the primary means by which we perform lookups from the application layer
+1
    • This means the given UUID field/column needs to be explicitly indexed
      • Mongo example: db.event.createIndex({uuid: 1}, {name: "ix_event_uuid"})
Are there penalties for having more than one field indexed?

    • This eliminates the need to force myriad, native database key datatypes into a generic representation (like string) for later verification and casting.
    • No application logic should ever be responsible for assigning these values
This likely means we need constructors for all of our external objects which use a common newGUUID() function.

  • For models at the database layer (like db/mongo above), suggest we use "Id" for the property containing the actual database key value
  • For models at all other layers (which will NOT have the above property) suggest we use "Key" as a string containing the user-defined UUID
I don't love this, as "key" is usually used in the context of SQL-like databases. I think the GUID should be identified as "Id", and internal DB keys should use "key", "dbId", or something along those lines.
  • In cases where it is necessary for the database layer to obtain IDs of newly inserted records, we must require that the "driver" used by EdgeX is capable of providing this functionality.
    • Example use case: see the AddEvent function
      • Our current mongo driver (gopkg.in/mgo.v2/bson) is incapable of returning the IDs of the newly inserted records for the readings, and so we must create those IDs in the application logic so that the DBRefs are aligned correctly when the event is stored.
      • If, however, we used the official Mongo go provider we would be able to obtain the IDs of newly inserted records due to support for Mongo's insertOne and insertMany capabilities.
    • NOTE: Our current mgo.v2 driver is no longer maintained, so we need to switch anyway!!!
+1

While listening to the Mongo presentation yesterday I decided to take a look at the license of mgo.v2 (written by one of my co-workers), and noticed it's *unsupported*.
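
For the constructor suggestion made earlier in this reply (a common helper used by every external object), one possible shape is sketched below. The names newUUID, Event, and NewEvent are hypothetical, and the identifier field is called Id only because that is the naming proposed in this reply. The helper builds a random version 4 UUID from crypto/rand so the sketch needs nothing outside the standard library; a real project might prefer an established UUID package.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// newUUID is a hypothetical shared helper returning a random (version 4)
// UUID in the usual 8-4-4-4-12 hex layout.
func newUUID() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set the version nibble to 4
	b[8] = (b[8] & 0x3f) | 0x80 // set the RFC 4122 variant bits
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// Event is a hypothetical external model identified by a UUID.
type Event struct {
	Id     string
	Device string
}

// NewEvent is the only place an Event identifier gets assigned.
func NewEvent(device string) (Event, error) {
	id, err := newUUID()
	if err != nil {
		return Event{}, err
	}
	return Event{Id: id, Device: device}, nil
}

func main() {
	e, err := NewEvent("thermostat-1")
	if err != nil {
		panic(err)
	}
	fmt.Println(e.Id, e.Device)
}
```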

Thanks for your attention in reading this. I'm not going to proceed any further down this path until I can get some feedback and clarity on the above points. As I said at the beginning, there are real benefits to the work done so far and once the export services have been re-organized to fit in the new package structure, we'll be in much better shape. But there's more work to be done and the primary intent of the original issue I created hasn't been addressed yet.

To ensure we follow through on making sure EdgeX is flexible enough to operate in a customer-defined environment, we have to decouple vendor-specific logic and types from the main application.
Thanks for the write-up Trevor, some really good stuff here.

Regards,
/tony



