Date   
Re: JSON Marshal vs Decode

Trevor.Conn@...
 

Well, we kind of have a method like that already.

 

https://github.com/edgexfoundry/go-mod-core-contracts/blob/4425df9b51ed055c673fdafb1a97590b6cd0f324/models/operatingstate.go#L53

 

Again, the problem is that everywhere operating state is used, we have to add this check.

 

Trevor

 

From: Goodell, Leonard <leonard.goodell@...>
Sent: Friday, March 8, 2019 10:50 AM
To: Conn, Trevor; ian.johnson@...
Cc: edgex-tsc-core@...; EdgeX-GoLang@...
Subject: RE: [Edgex-golang] JSON Marshal vs Decode

 

[EXTERNAL EMAIL]

Could add a ValidateIsSet() func to OperatingState that the parents calls so it is sill responsibility/encapsulated in OperatingState.

 

From: Trevor.Conn@... <Trevor.Conn@...>
Sent: Friday, March 08, 2019 9:40 AM
To: Goodell, Leonard <leonard.goodell@...>; ian.johnson@...
Cc: edgex-tsc-core@...; EdgeX-GoLang@...
Subject: RE: [Edgex-golang] JSON Marshal vs Decode

 

Lenny – Yes, it does appear that is the mechanism preventing the UnmarshalJSON from being called. But I have two follow ups:

 

  1. Why does it work when calling Marshal()? I suspect it’s because there isn’t a pointer in the receiver and OperatingState has an empty string value on the parent type whereas during Unmarshal there isn’t a pointer to reference.
  2. I’m not a fan of putting the knowledge of “check for empty string” in all the types that have OperatingState as a member when the whole point of making OperatingState a composite literal type is to encapsulate that check. It may as well be a regular string in that case.

 

Good discussion.

 

Trevor

 

From: EdgeX-GoLang@... <EdgeX-GoLang@...> On Behalf Of Goodell, Leonard
Sent: Friday, March 8, 2019 10:20 AM
To: Conn, Trevor; ian.johnson@...
Cc: edgex-tsc-core@...; EdgeX-GoLang@...
Subject: Re: [Edgex-golang] JSON Marshal vs Decode

 

[EXTERNAL EMAIL]

So the issue is the custom validation can’t run if the incoming JSON does exist for that object, which makes sense. Seems to me it is then the responsibility of the parent object’s custom validation to handle this case, i.e. simple empty value check.

 

In this case have UnmarshalJSON() in models/provisionwatcher.go should check a.OperatingState for empty string or constant representing ‘empty’ before setting it.

 

Lenny

 

From: EdgeX-GoLang@... <EdgeX-GoLang@...> On Behalf Of Trevor.Conn@...
Sent: Thursday, March 07, 2019 4:58 PM
To: ian.johnson@...
Cc: edgex-tsc-core@...; EdgeX-GoLang@...
Subject: Re: [Edgex-golang] JSON Marshal vs Decode

 

We may be moving towards a better solution with Ian’s suggestion.

 

I have created a follow up branch here:

https://github.com/tsconn23/go-mod-core-contracts/tree/marshal_bug_ian

 

Only the contracts require changes in this case and, as Ian says below, the change is to add the Unmarshal function to the ProvisionWatcher where the receiver is a pointer. There’s still one problem though, suggestions are welcome.

 

This still doesn’t fix the bug. I can now step into the Unmarshal function, so we’re closer. But the related Unmarshal in the OperatingState is never called because it’s not supplied on the request – in either the ProvisionWatcher itself or the related DeviceService.

 

Here is the example request I am using to test. This comes from the core-metadata-importer collection from the blackbox tests.

 

POST http://localhost:48081/api/v1/provisionwatcher

{"_class":"org.edgexfoundry.domain.meta.ProvisionWatcher","name":"lonworks watcher test","identifiers":{"MAC":"00-05-1C-C1-99-99","HTTP":"10.0.0.4"},"origin":1471806386919,"profile":{"name":"variable speed motor profile"},"service":{"name":"home variable speed motor device service"}}

 

If you’re going to test this locally you will need to run the core-metadata-importer collection in your local environment first due to the Device Service relationship.

 

Trevor

 

P.S. Even with this, I still think having a more streamlined request makes sense. But if we can work this out, it could be a good middling step that we can implement for Edinburgh.

 

From: Ian Johnson <ian.johnson@...>
Sent: Thursday, March 7, 2019 5:04 PM
To: Conn, Trevor
Cc: edgex-tsc-core@...; EdgeX-GoLang@...
Subject: Re: [Edgex-golang] JSON Marshal vs Decode

 

[EXTERNAL EMAIL]

Hi Trevor, sorry for missing the call today so perhaps I'm missing some context here, but why doesn't the custom unmarshaller method get called for the type you are using? 

 

A problem I see with the solution you have in your branch is that you're unmarshalling the json and then you re-marshal it again just to get the custom error checking logic in your marshaller to run. This seems both inefficent and confusing.

 

I think that a better way to handle this is to re-define your custom Unmarshaller for a pointer to the type as that will cause it to run and be used. As you have it right now, it's I don't think it's being used because you call Decode you're using a pointer, and as such unmarshalling into a pointer, but the custom method you've defined is a method on a object directly and thus is ignored. However you will find if you try the trivial implementation for this Unmarshaller which is just to call json.Unmarshal on the data passed in to your custom unmarshaller method that you will enter an infinite loop. A good trick I've seen before is to use a type alias and an "anonymous struct" with struct embedding to run the generic unmarshaller on the data you're provided but in a way that it doesn't call the custom method you've defined. You can see an example of this here: https://gist.github.com/anonymouse64/bd0321890ce34ba39b92c536bc9080d0

I recommend this approach over the re-encoding method you have today.

 

Thanks,

Ian 

On Thu, Mar 7, 2019 at 12:43 PM <Trevor.Conn@...> wrote:

Hi all – I’m sending this email out as a follow up to today’s Core WG call. The recording of the call can be found on the wiki page and discussion pertinent to this topic starts at about the 30 minute mark.

 

https://wiki.edgexfoundry.org/display/FA/Core+Working+Group

 

The issue in a nutshell is that in large measure throughout the EdgeX Go services, we ingest and deserialize JSON requests in the following manner:

 

var e models.Event

dec := json.NewDecoder(r.Body)

err := dec.Decode(&e)

 

Or, alternatively stated

 

var pw models.ProvisionWatcher
json.NewDecoder(r.Body).Decode(&pw)

 

The mechanism used here to decode the request body from a stream containing JSON into a given type does NOT cause the custom unmarshaling logic present in many contracts to be called. This was proven and demo’ed on the call. In essence, this means we have an issue guaranteeing the integrity of requests received by all services in edgex-go that utilize the above pattern.

 

I presented a solution to this problem on the call. The depth to which we embrace the solution and the associated timeframe is TBD. If you wish to more closely review the code that I showed during the call, I have made that available via the following feature branches.

 

https://github.com/tsconn23/edgex-go/tree/marshal_bug

https://github.com/tsconn23/go-mod-core-contracts/tree/marshal_bug

 

If you are interested in this topic, I am happy to take any questions/comments via this email thread or Slack. Thanks.

 

Trevor Conn

Senior Principal Software Engineer

Dell Technologies | IoT DellTech

Trevor.Conn@...

Round Rock, TX USA

 

Re: JSON Marshal vs Decode

Goodell, Leonard
 

That method looks to be working off raw JSON where the method I suggest is on the object, not the JSON.

 

But this isn’t needed if you can get the “remove the omitEmpty tag” working. BTW, I looked at the structs and didn’t see any omitEmpty tag in use. Where is that specified?

 

Thanks!

   Lenny

 

From: EdgeX-GoLang@... <EdgeX-GoLang@...> On Behalf Of Trevor.Conn@...
Sent: Friday, March 08, 2019 9:54 AM
To: Goodell, Leonard <leonard.goodell@...>; ian.johnson@...
Cc: edgex-tsc-core@...; EdgeX-GoLang@...
Subject: Re: [Edgex-golang] JSON Marshal vs Decode

 

Well, we kind of have a method like that already.

 

https://github.com/edgexfoundry/go-mod-core-contracts/blob/4425df9b51ed055c673fdafb1a97590b6cd0f324/models/operatingstate.go#L53

 

Again, the problem is that everywhere operating state is used, we have to add this check.

 

Trevor

 

From: Goodell, Leonard <leonard.goodell@...>
Sent: Friday, March 8, 2019 10:50 AM
To: Conn, Trevor; ian.johnson@...
Cc: edgex-tsc-core@...; EdgeX-GoLang@...
Subject: RE: [Edgex-golang] JSON Marshal vs Decode

 

[EXTERNAL EMAIL]

Could add a ValidateIsSet() func to OperatingState that the parents calls so it is sill responsibility/encapsulated in OperatingState.

 

From: Trevor.Conn@... <Trevor.Conn@...>
Sent: Friday, March 08, 2019 9:40 AM
To: Goodell, Leonard <leonard.goodell@...>; ian.johnson@...
Cc: edgex-tsc-core@...; EdgeX-GoLang@...
Subject: RE: [Edgex-golang] JSON Marshal vs Decode

 

Lenny – Yes, it does appear that is the mechanism preventing the UnmarshalJSON from being called. But I have two follow ups:

 

  1. Why does it work when calling Marshal()? I suspect it’s because there isn’t a pointer in the receiver and OperatingState has an empty string value on the parent type whereas during Unmarshal there isn’t a pointer to reference.
  2. I’m not a fan of putting the knowledge of “check for empty string” in all the types that have OperatingState as a member when the whole point of making OperatingState a composite literal type is to encapsulate that check. It may as well be a regular string in that case.

 

Good discussion.

 

Trevor

 

From: EdgeX-GoLang@... <EdgeX-GoLang@...> On Behalf Of Goodell, Leonard
Sent: Friday, March 8, 2019 10:20 AM
To: Conn, Trevor; ian.johnson@...
Cc: edgex-tsc-core@...; EdgeX-GoLang@...
Subject: Re: [Edgex-golang] JSON Marshal vs Decode

 

[EXTERNAL EMAIL]

So the issue is the custom validation can’t run if the incoming JSON does exist for that object, which makes sense. Seems to me it is then the responsibility of the parent object’s custom validation to handle this case, i.e. simple empty value check.

 

In this case have UnmarshalJSON() in models/provisionwatcher.go should check a.OperatingState for empty string or constant representing ‘empty’ before setting it.

 

Lenny

 

From: EdgeX-GoLang@... <EdgeX-GoLang@...> On Behalf Of Trevor.Conn@...
Sent: Thursday, March 07, 2019 4:58 PM
To: ian.johnson@...
Cc: edgex-tsc-core@...; EdgeX-GoLang@...
Subject: Re: [Edgex-golang] JSON Marshal vs Decode

 

We may be moving towards a better solution with Ian’s suggestion.

 

I have created a follow up branch here:

https://github.com/tsconn23/go-mod-core-contracts/tree/marshal_bug_ian

 

Only the contracts require changes in this case and, as Ian says below, the change is to add the Unmarshal function to the ProvisionWatcher where the receiver is a pointer. There’s still one problem though, suggestions are welcome.

 

This still doesn’t fix the bug. I can now step into the Unmarshal function, so we’re closer. But the related Unmarshal in the OperatingState is never called because it’s not supplied on the request – in either the ProvisionWatcher itself or the related DeviceService.

 

Here is the example request I am using to test. This comes from the core-metadata-importer collection from the blackbox tests.

 

POST http://localhost:48081/api/v1/provisionwatcher

{"_class":"org.edgexfoundry.domain.meta.ProvisionWatcher","name":"lonworks watcher test","identifiers":{"MAC":"00-05-1C-C1-99-99","HTTP":"10.0.0.4"},"origin":1471806386919,"profile":{"name":"variable speed motor profile"},"service":{"name":"home variable speed motor device service"}}

 

If you’re going to test this locally you will need to run the core-metadata-importer collection in your local environment first due to the Device Service relationship.

 

Trevor

 

P.S. Even with this, I still think having a more streamlined request makes sense. But if we can work this out, it could be a good middling step that we can implement for Edinburgh.

 

From: Ian Johnson <ian.johnson@...>
Sent: Thursday, March 7, 2019 5:04 PM
To: Conn, Trevor
Cc: edgex-tsc-core@...; EdgeX-GoLang@...
Subject: Re: [Edgex-golang] JSON Marshal vs Decode

 

[EXTERNAL EMAIL]

Hi Trevor, sorry for missing the call today so perhaps I'm missing some context here, but why doesn't the custom unmarshaller method get called for the type you are using? 

 

A problem I see with the solution you have in your branch is that you're unmarshalling the json and then you re-marshal it again just to get the custom error checking logic in your marshaller to run. This seems both inefficent and confusing.

 

I think that a better way to handle this is to re-define your custom Unmarshaller for a pointer to the type as that will cause it to run and be used. As you have it right now, it's I don't think it's being used because you call Decode you're using a pointer, and as such unmarshalling into a pointer, but the custom method you've defined is a method on a object directly and thus is ignored. However you will find if you try the trivial implementation for this Unmarshaller which is just to call json.Unmarshal on the data passed in to your custom unmarshaller method that you will enter an infinite loop. A good trick I've seen before is to use a type alias and an "anonymous struct" with struct embedding to run the generic unmarshaller on the data you're provided but in a way that it doesn't call the custom method you've defined. You can see an example of this here: https://gist.github.com/anonymouse64/bd0321890ce34ba39b92c536bc9080d0

I recommend this approach over the re-encoding method you have today.

 

Thanks,

Ian 

On Thu, Mar 7, 2019 at 12:43 PM <Trevor.Conn@...> wrote:

Hi all – I’m sending this email out as a follow up to today’s Core WG call. The recording of the call can be found on the wiki page and discussion pertinent to this topic starts at about the 30 minute mark.

 

https://wiki.edgexfoundry.org/display/FA/Core+Working+Group

 

The issue in a nutshell is that in large measure throughout the EdgeX Go services, we ingest and deserialize JSON requests in the following manner:

 

var e models.Event

dec := json.NewDecoder(r.Body)

err := dec.Decode(&e)

 

Or, alternatively stated

 

var pw models.ProvisionWatcher
json.NewDecoder(r.Body).Decode(&pw)

 

The mechanism used here to decode the request body from a stream containing JSON into a given type does NOT cause the custom unmarshaling logic present in many contracts to be called. This was proven and demo’ed on the call. In essence, this means we have an issue guaranteeing the integrity of requests received by all services in edgex-go that utilize the above pattern.

 

I presented a solution to this problem on the call. The depth to which we embrace the solution and the associated timeframe is TBD. If you wish to more closely review the code that I showed during the call, I have made that available via the following feature branches.

 

https://github.com/tsconn23/edgex-go/tree/marshal_bug

https://github.com/tsconn23/go-mod-core-contracts/tree/marshal_bug

 

If you are interested in this topic, I am happy to take any questions/comments via this email thread or Slack. Thanks.

 

Trevor Conn

Senior Principal Software Engineer

Dell Technologies | IoT DellTech

Trevor.Conn@...

Round Rock, TX USA

 

Core Contract Model Validators

Trevor.Conn@...
 

Hi all -- I wanted to follow up on this discussion we've been having in the Core Working Group about validating our contract models and where that responsibility lies. This past week we talked about two options.


1.) Using a JSON Schema-based validation

2.) Creating a validator implementation that can be called as part of the UnmarshalJSON operation found in some of our models.


I explored the JSON Schema solution a bit on Thursday afternoon with help from Intel, but I did not find it palatable. The schema has to be created and maintained then loaded at runtime from either URL, file or inline string. Child types have their schema, but that schema must be either be repeated in the schema of the parent type or referenced via a pointer. This all seemed like too much work so before getting seriously into it, I decided to work on the validator idea.


I created a couple feature branches which you can review/pull at the links below. This involves changes to both core-contracts and edgex-go.


https://github.com/tsconn23/go-mod-core-contracts/tree/validators

https://github.com/tsconn23/edgex-go/tree/validators


Rather than ProvisionWatcher I have used DeviceService in my example above because it already had an UnmarshalJSON() method in place, and also with the ProvisionWatcher I'd have to account for DeviceProfile as well. I just wanted to get something up and going to try and prove the concept, so the scope was more reduced by using DeviceService.


Here's a sample request I'm using to test:

POST http://localhost:48081/api/v1/deviceservice

{"_class":"org.edgexfoundry.domain.meta.DeviceService","adminState":"unlocked","operatingState":"disabled1","name":"ds_test1","description":"manage home variable speed motors","lastConnected":0,"lastReported":0,"labels":["hvac","variable speed motor"],"origin":1471806386919}


In the request body above there are two issues. First, the operatingState value is invalid. Second, the Addressable is missing. If you execute the above locally, you'll receive a 400 (Bad Request) with the error message "invalid OperatingState 'disabled1'".

If you fix the OperatingState value, then you'll still receive a 400 with a new error message "Addressable ID and Name are both blank". This is coming from the validator inside of the Addressable model within core-contracts. I noticed that we'll have the opportunity to slim our controller logic by moving the server-side validation we do today into the models. We'll also have consistent state validation (example: validation today is different depending on if you're adding a new Device Service versus updating an existing one).


The order of operations with regard to unmarshaling and validation looks like this

  • DeviceService
    • UnmarshalJSON
      • OperatingState (if provided in the request)
        • UnmarshalJSON
        • validate()
      • AdminState (if provided in the request)
        • UnmarshalJSON
        • validate()
    • validate()
      • OperatingState
        • validate()
      • AdminState
        • validate()
Herein is the only thing that gives me minor heartburn about the solution. The validate() method of both OperatingState and AdminState are called as part of their respective UnmarshalJSON implementations. It doesn't have to be but I thought it made sense to do so since it allows us to fail faster if we detect a problem.

Assuming those pass, then DeviceService calls its own validate() method which takes care of validating all child types. This is necessary in case a child type's key was omitted in the JSON. This means validation for any of the child types can occur twice -- a cheap operation to be sure, but redundant at times.

As an aside, you will see changes in the feature branches related to the elimination of models.Service from core-contracts. Nothing else was composed from this model other than the DeviceService and I felt that eliminating it would simplify the solution.

Comments are welcome. If this looks like a viable approach to folks, then there might be some bandwidth soon to undertake adding UnmarshalJSON to the models that do not have them, and implementing the necessary validate() handlers.


Trevor Conn
Technical Staff Engineer
Dell Technologies | IoT DellTech
Trevor_Conn@...
Round Rock, TX  USA

Re: Core Contract Model Validators

Trevor.Conn@...
 

Another update on this –

 

This week in the #core channel on Slack Ian Johnson asked me why the approach below didn’t involve creating a Validator interface and using reflection to go through the members of a type, calling Validate() on those that implemented the interface. As those on the last Core WG call will remember, this was the original idea I proposed. So since Ian effectively +1’ed my original proposal I decided to implement it for a side by side comparison.

 

The changes only needed to be made to the core-contracts module and can be found here:

https://github.com/tsconn23/go-mod-core-contracts/tree/validator_intf

 

This is the first Go-based reflection code I’ve written, so there could be other ways to do this. However, I used the JSON Unmarshaler implementation from the standard library as an example.

 

With the current structure of our requests, it’s hard to know where to invoke the validation from. In this example, the DeviceService is the top-level struct so I start the process here. However one could argue that an Addressable should be able to validate independently since we can add/update an Addressable on its own. That would mean the Addressable would end up being validated twice in this case. Since we pass a deeply nested multii-purpose model around instead of Request types specific to the API endpoint that could be used for this kind of control, it probably is what it is unless we refactor.

 

Also, w/r/t recursion, I’d look to accomplish that through the Validate() functions at each node in the graph. So DeviceService calls Validate() on all its sub-types where applicable, then those sub-types would do the same followed by their sub-types and so on.

 

Let me know your thoughts. We’ll probably be talking about this at this week’s meeting.

 

Trevor

 

 

From: Conn, Trevor
Sent: Saturday, March 30, 2019 4:27 PM
To: EdgeX-GoLang@...; edgex-tsc-core@...
Subject: Core Contract Model Validators

 

Hi all -- I wanted to follow up on this discussion we've been having in the Core Working Group about validating our contract models and where that responsibility lies. This past week we talked about two options.

 

1.) Using a JSON Schema-based validation

2.) Creating a validator implementation that can be called as part of the UnmarshalJSON operation found in some of our models.

 

I explored the JSON Schema solution a bit on Thursday afternoon with help from Intel, but I did not find it palatable. The schema has to be created and maintained then loaded at runtime from either URL, file or inline string. Child types have their schema, but that schema must be either be repeated in the schema of the parent type or referenced via a pointer. This all seemed like too much work so before getting seriously into it, I decided to work on the validator idea.

 

I created a couple feature branches which you can review/pull at the links below. This involves changes to both core-contracts and edgex-go.

 

https://github.com/tsconn23/go-mod-core-contracts/tree/validators

https://github.com/tsconn23/edgex-go/tree/validators

 

Rather than ProvisionWatcher I have used DeviceService in my example above because it already had an UnmarshalJSON() method in place, and also with the ProvisionWatcher I'd have to account for DeviceProfile as well. I just wanted to get something up and going to try and prove the concept, so the scope was more reduced by using DeviceService.

 

Here's a sample request I'm using to test:

POST http://localhost:48081/api/v1/deviceservice

{"_class":"org.edgexfoundry.domain.meta.DeviceService","adminState":"unlocked","operatingState":"disabled1","name":"ds_test1","description":"manage home variable speed motors","lastConnected":0,"lastReported":0,"labels":["hvac","variable speed motor"],"origin":1471806386919}

 

In the request body above there are two issues. First, the operatingState value is invalid. Second, the Addressable is missing. If you execute the above locally, you'll receive a 400 (Bad Request) with the error message "invalid OperatingState 'disabled1'".

If you fix the OperatingState value, then you'll still receive a 400 with a new error message "Addressable ID and Name are both blank". This is coming from the validator inside of the Addressable model within core-contracts. I noticed that we'll have the opportunity to slim our controller logic by moving the server-side validation we do today into the models. We'll also have consistent state validation (example: validation today is different depending on if you're adding a new Device Service versus updating an existing one).

 

The order of operations with regard to unmarshaling and validation looks like this

  • DeviceService
    • UnmarshalJSON
      • OperatingState (if provided in the request)
        • UnmarshalJSON
        • validate()
      • AdminState (if provided in the request)
        • UnmarshalJSON
        • validate()
    • validate()
      • OperatingState
        • validate()
      • AdminState
        • validate()

Herein is the only thing that gives me minor heartburn about the solution. The validate() method of both OperatingState and AdminState are called as part of their respective UnmarshalJSON implementations. It doesn't have to be but I thought it made sense to do so since it allows us to fail faster if we detect a problem.

 

Assuming those pass, then DeviceService calls its own validate() method which takes care of validating all child types. This is necessary in case a child type's key was omitted in the JSON. This means validation for any of the child types can occur twice -- a cheap operation to be sure, but redundant at times.

 

As an aside, you will see changes in the feature branches related to the elimination of models.Service from core-contracts. Nothing else was composed from this model other than the DeviceService and I felt that eliminating it would simplify the solution.

 

Comments are welcome. If this looks like a viable approach to folks, then there might be some bandwidth soon to undertake adding UnmarshalJSON to the models that do not have them, and implementing the necessary validate() handlers.

 

Trevor Conn
Technical Staff Engineer
Dell Technologies | IoT DellTech
Trevor_Conn@...
Round Rock, TX  USA

Re: [Edgex-tsc-core] Core Contract Model Validators

espy
 

On 4/2/19 6:38 PM, Trevor.Conn@... wrote:

Another update on this –

 

This week in the #core channel on Slack Ian Johnson asked me why the approach below didn’t involve creating a Validator interface and using reflection to go through the members of a type, calling Validate() on those that implemented the interface. As those on the last Core WG call will remember, this was the original idea I proposed. So since Ian effectively +1’ed my original proposal I decided to implement it for a side by side comparison.

 

The changes only needed to be made to the core-contracts module and can be found here:

https://github.com/tsconn23/go-mod-core-contracts/tree/validator_intf

 

This is the first Go-based reflection code I’ve written, so there could be other ways to do this. However, I used the JSON Unmarshaler implementation from the standard library as an example.

I did a bit of research on this and there's been at least one formal golang proposal to enhance the JSON annotation language to cover missing fields:

https://github.com/golang/go/issues/16426

That said, although the proposal was shot down, the use of a Validator interface was discussed as an alternative.

I'm +1 on the overall approach (including use of reflection).

 

With the current structure of our requests, it’s hard to know where to invoke the validation from. In this example, the DeviceService is the top-level struct so I start the process here. However one could argue that an Addressable should be able to validate independently since we can add/update an Addressable on its own. That would mean the Addressable would end up being validated twice in this case.

I think always validating when unmarshalling an external request makes sense, and one could argue that validating if/when we construct a new instance of a model makes sense too. I'm not sure I follow the exact flow where we'd validate twice?

Since we pass a deeply nested multii-purpose model around instead of Request types specific to the API endpoint that could be used for this kind of control, it probably is what it is unless we refactor.

 

Also, w/r/t recursion, I’d look to accomplish that through the Validate() functions at each node in the graph. So DeviceService calls Validate() on all its sub-types where applicable, then those sub-types would do the same followed by their sub-types and so on.

Seems reasonable

Regards,
/tony

 

Let me know your thoughts. We’ll probably be talking about this at this week’s meeting.

 

Trevor

 

 

From: Conn, Trevor
Sent: Saturday, March 30, 2019 4:27 PM
To: EdgeX-GoLang@...; edgex-tsc-core@...
Subject: Core Contract Model Validators

 

Hi all -- I wanted to follow up on this discussion we've been having in the Core Working Group about validating our contract models and where that responsibility lies. This past week we talked about two options.

 

1.) Using a JSON Schema-based validation

2.) Creating a validator implementation that can be called as part of the UnmarshalJSON operation found in some of our models.

 

I explored the JSON Schema solution a bit on Thursday afternoon with help from Intel, but I did not find it palatable. The schema has to be created and maintained then loaded at runtime from either URL, file or inline string. Child types have their schema, but that schema must be either be repeated in the schema of the parent type or referenced via a pointer. This all seemed like too much work so before getting seriously into it, I decided to work on the validator idea.

 

I created a couple feature branches which you can review/pull at the links below. This involves changes to both core-contracts and edgex-go.

 

https://github.com/tsconn23/go-mod-core-contracts/tree/validators

https://github.com/tsconn23/edgex-go/tree/validators

 

Rather than ProvisionWatcher I have used DeviceService in my example above because it already had an UnmarshalJSON() method in place, and also with the ProvisionWatcher I'd have to account for DeviceProfile as well. I just wanted to get something up and going to try and prove the concept, so the scope was more reduced by using DeviceService.

 

Here's a sample request I'm using to test:

POST http://localhost:48081/api/v1/deviceservice

{"_class":"org.edgexfoundry.domain.meta.DeviceService","adminState":"unlocked","operatingState":"disabled1","name":"ds_test1","description":"manage home variable speed motors","lastConnected":0,"lastReported":0,"labels":["hvac","variable speed motor"],"origin":1471806386919}

 

In the request body above there are two issues. First, the operatingState value is invalid. Second, the Addressable is missing. If you execute the above locally, you'll receive a 400 (Bad Request) with the error message "invalid OperatingState 'disabled1'".

If you fix the OperatingState value, then you'll still receive a 400 with a new error message "Addressable ID and Name are both blank". This is coming from the validator inside of the Addressable model within core-contracts. I noticed that we'll have the opportunity to slim our controller logic by moving the server-side validation we do today into the models. We'll also have consistent state validation (example: validation today is different depending on if you're adding a new Device Service versus updating an existing one).

 

The order of operations with regard to unmarshaling and validation looks like this

  • DeviceService
    • UnmarshalJSON
      • OperatingState (if provided in the request)
        • UnmarshalJSON
        • validate()
      • AdminState (if provided in the request)
        • UnmarshalJSON
        • validate()
    • validate()
      • OperatingState
        • validate()
      • AdminState
        • validate()

Herein is the only thing that gives me minor heartburn about the solution. The validate() method of both OperatingState and AdminState are called as part of their respective UnmarshalJSON implementations. It doesn't have to be but I thought it made sense to do so since it allows us to fail faster if we detect a problem.

 

Assuming those pass, then DeviceService calls its own validate() method which takes care of validating all child types. This is necessary in case a child type's key was omitted in the JSON. This means validation for any of the child types can occur twice -- a cheap operation to be sure, but redundant at times.

 

As an aside, you will see changes in the feature branches related to the elimination of models.Service from core-contracts. Nothing else was composed from this model other than the DeviceService and I felt that eliminating it would simplify the solution.

 

Comments are welcome. If this looks like a viable approach to folks, then there might be some bandwidth soon to undertake adding UnmarshalJSON to the models that do not have them, and implementing the necessary validate() handlers.

 

Trevor Conn
Technical Staff Engineer
Dell Technologies | IoT DellTech
Trevor_Conn@...
Round Rock, TX  USA

Re: [Edgex-tsc-core] Core Contract Model Validators

espy
 

On 4/4/19 9:49 AM, Tony Espy wrote:

On 4/2/19 6:38 PM, Trevor.Conn@... wrote:

Another update on this –

 

This week in the #core channel on Slack Ian Johnson asked me why the approach below didn’t involve creating a Validator interface and using reflection to go through the members of a type, calling Validate() on those that implemented the interface. As those on the last Core WG call will remember, this was the original idea I proposed. So since Ian effectively +1’ed my original proposal I decided to implement it for a side by side comparison.

 

The changes only needed to be made to the core-contracts module and can be found here:

https://github.com/tsconn23/go-mod-core-contracts/tree/validator_intf

 

This is the first Go-based reflection code I’ve written, so there could be other ways to do this. However, I used the JSON Unmarshaler implementation from the standard library as an example.

I did a bit of research on this and there's been at least one formal golang proposal to enhance the JSON annotation language to cover missing fields:

https://github.com/golang/go/issues/16426

That said, although the proposal was shot down, the use of a Validator interface was discussed as an alternative.

I'm +1 on the overall approach (including use of reflection).

 

With the current structure of our requests, it’s hard to know where to invoke the validation from. In this example, the DeviceService is the top-level struct so I start the process here. However one could argue that an Addressable should be able to validate independently since we can add/update an Addressable on its own. That would mean the Addressable would end up being validated twice in this case.

I think always validating when unmarshalling an external request makes sense, and one could argue that validating if/when we construct a new instance of a model makes sense too. I'm not sure I follow the exact flow where we'd validate twice?

An alternate/simpler approach to that solves the duplicate validation calls problem is just to make validation a responsibility of the original caller (ie. the function/method that makes the first json.Unmarshall()) instead of always calling Validate() at the end of a model's UnmarshallJSON().

Regards,
/tony

Since we pass a deeply nested multii-purpose model around instead of Request types specific to the API endpoint that could be used for this kind of control, it probably is what it is unless we refactor.

 

Also, w/r/t recursion, I’d look to accomplish that through the Validate() functions at each node in the graph. So DeviceService calls Validate() on all its sub-types where applicable, then those sub-types would do the same followed by their sub-types and so on.

Seems reasonable

Regards,
/tony

 

Let me know your thoughts. We’ll probably be talking about this at this week’s meeting.

 

Trevor

 

 

From: Conn, Trevor
Sent: Saturday, March 30, 2019 4:27 PM
To: EdgeX-GoLang@...; edgex-tsc-core@...
Subject: Core Contract Model Validators

 

Hi all -- I wanted to follow up on this discussion we've been having in the Core Working Group about validating our contract models and where that responsibility lies. This past week we talked about two options.

 

1.) Using a JSON Schema-based validation

2.) Creating a validator implementation that can be called as part of the UnmarshalJSON operation found in some of our models.

 

I explored the JSON Schema solution a bit on Thursday afternoon with help from Intel, but I did not find it palatable. The schema has to be created and maintained then loaded at runtime from either URL, file or inline string. Child types have their schema, but that schema must be either be repeated in the schema of the parent type or referenced via a pointer. This all seemed like too much work so before getting seriously into it, I decided to work on the validator idea.

 

I created a couple feature branches which you can review/pull at the links below. This involves changes to both core-contracts and edgex-go.

 

https://github.com/tsconn23/go-mod-core-contracts/tree/validators

https://github.com/tsconn23/edgex-go/tree/validators

 

Rather than ProvisionWatcher I have used DeviceService in my example above because it already had an UnmarshalJSON() method in place, and also with the ProvisionWatcher I'd have to account for DeviceProfile as well. I just wanted to get something up and going to try and prove the concept, so the scope was more reduced by using DeviceService.

 

Here's a sample request I'm using to test:

POST http://localhost:48081/api/v1/deviceservice

{"_class":"org.edgexfoundry.domain.meta.DeviceService","adminState":"unlocked","operatingState":"disabled1","name":"ds_test1","description":"manage home variable speed motors","lastConnected":0,"lastReported":0,"labels":["hvac","variable speed motor"],"origin":1471806386919}

 

In the request body above there are two issues. First, the operatingState value is invalid. Second, the Addressable is missing. If you execute the above locally, you'll receive a 400 (Bad Request) with the error message "invalid OperatingState 'disabled1'".

If you fix the OperatingState value, then you'll still receive a 400 with a new error message "Addressable ID and Name are both blank". This is coming from the validator inside of the Addressable model within core-contracts. I noticed that we'll have the opportunity to slim our controller logic by moving the server-side validation we do today into the models. We'll also have consistent state validation (example: validation today is different depending on if you're adding a new Device Service versus updating an existing one).

 

The order of operations with regard to unmarshaling and validation looks like this

  • DeviceService
    • UnmarshalJSON
      • OperatingState (if provided in the request)
        • UnmarshalJSON
        • validate()
      • AdminState (if provided in the request)
        • UnmarshalJSON
        • validate()
    • validate()
      • OperatingState
        • validate()
      • AdminState
        • validate()

Herein is the only thing that gives me minor heartburn about the solution. The validate() method of both OperatingState and AdminState are called as part of their respective UnmarshalJSON implementations. It doesn't have to be but I thought it made sense to do so since it allows us to fail faster if we detect a problem.

 

Assuming those pass, then DeviceService calls its own validate() method which takes care of validating all child types. This is necessary in case a child type's key was omitted in the JSON. This means validation for any of the child types can occur twice -- a cheap operation to be sure, but redundant at times.

 

As an aside, you will see changes in the feature branches related to the elimination of models.Service from core-contracts. Nothing else was composed from this model other than the DeviceService and I felt that eliminating it would simplify the solution.

 

Comments are welcome. If this looks like a viable approach to folks, then there might be some bandwidth soon to undertake adding UnmarshalJSON to the models that do not have them, and implementing the necessary validate() handlers.

 

Trevor Conn
Technical Staff Engineer
Dell Technologies | IoT DellTech
Trevor_Conn@...
Round Rock, TX  USA

Re: [Edgex-tsc-core] Core Contract Model Validators

Malini Bhandaru
 

Thank you Trevor for the links! Shall check them out.

 

Am I correct in assuming that the validator will be invoked at the recipient endpoint of a rest call, and only then?

Then we could as part of our microservices have a base class that does validation as a first step, kind of like form input validation at the server side.

Reflection used recursively does makes perfect sense.

 

There was a comment during today’s core WG call about annotations and min required fields.

One thing that this does not address is “context” based required fields. For example a person may need to share only name and phone in “contact” but

Needs to share name, phone and address for a parcel delivery type flow.

 

What do people think about using gRPC for inter microservice communication? Not for Edinburg but going forward. gRPC has schemas, efficient.

But Trevor may point out that schemas also constrain that we provide more as opposed to just a name.

 

Was driving, so I was unsuccessful in unmuting, but want to chime in on log levels. I do think the default should not be trace.

 

Sincerely

Malini

 

From: <EdgeX-GoLang@...> on behalf of "espy via Lists.Edgexfoundry.Org" <espy=canonical.com@...>
Reply-To: "espy@..." <espy@...>
Date: Thursday, April 4, 2019 at 12:13 PM
To: "Conn, Trevor (EMC)" <Trevor_Conn@...>, "EdgeX-GoLang@..." <EdgeX-GoLang@...>, "edgex-tsc-core@..." <edgex-tsc-core@...>
Cc: "EdgeX-GoLang@..." <EdgeX-GoLang@...>
Subject: Re: [Edgex-golang] [Edgex-tsc-core] Core Contract Model Validators

 

On 4/4/19 9:49 AM, Tony Espy wrote:

On 4/2/19 6:38 PM, Trevor.Conn@... wrote:

Another update on this –

 

This week in the #core channel on Slack Ian Johnson asked me why the approach below didn’t involve creating a Validator interface and using reflection to go through the members of a type, calling Validate() on those that implemented the interface. As those on the last Core WG call will remember, this was the original idea I proposed. So since Ian effectively +1’ed my original proposal I decided to implement it for a side by side comparison.

 

The changes only needed to be made to the core-contracts module and can be found here:

https://github.com/tsconn23/go-mod-core-contracts/tree/validator_intf

 

This is the first Go-based reflection code I’ve written, so there could be other ways to do this. However, I used the JSON Unmarshaler implementation from the standard library as an example.

I did a bit of research on this and there's been at least one formal golang proposal to enhance the JSON annotation language to cover missing fields:

https://github.com/golang/go/issues/16426

That said, although the proposal was shot down, the use of a Validator interface was discussed as an alternative.

I'm +1 on the overall approach (including use of reflection).

 

With the current structure of our requests, it’s hard to know where to invoke the validation from. In this example, the DeviceService is the top-level struct so I start the process here. However one could argue that an Addressable should be able to validate independently since we can add/update an Addressable on its own. That would mean the Addressable would end up being validated twice in this case.

I think always validating when unmarshalling an external request makes sense, and one could argue that validating if/when we construct a new instance of a model makes sense too. I'm not sure I follow the exact flow where we'd validate twice?

An alternate/simpler approach to that solves the duplicate validation calls problem is just to make validation a responsibility of the original caller (ie. the function/method that makes the first json.Unmarshall()) instead of always calling Validate() at the end of a model's UnmarshallJSON().

Regards,
/tony

Since we pass a deeply nested multii-purpose model around instead of Request types specific to the API endpoint that could be used for this kind of control, it probably is what it is unless we refactor.

 

Also, w/r/t recursion, I’d look to accomplish that through the Validate() functions at each node in the graph. So DeviceService calls Validate() on all its sub-types where applicable, then those sub-types would do the same followed by their sub-types and so on.

Seems reasonable

Regards,
/tony

 

Let me know your thoughts. We’ll probably be talking about this at this week’s meeting.

 

Trevor

 

 

From: Conn, Trevor
Sent: Saturday, March 30, 2019 4:27 PM
To: EdgeX-GoLang@...; edgex-tsc-core@...
Subject: Core Contract Model Validators

 

Hi all -- I wanted to follow up on this discussion we've been having in the Core Working Group about validating our contract models and where that responsibility lies. This past week we talked about two options.

 

1.) Using a JSON Schema-based validation

2.) Creating a validator implementation that can be called as part of the UnmarshalJSON operation found in some of our models.

 

I explored the JSON Schema solution a bit on Thursday afternoon with help from Intel, but I did not find it palatable. The schema has to be created and maintained then loaded at runtime from either URL, file or inline string. Child types have their schema, but that schema must be either be repeated in the schema of the parent type or referenced via a pointer. This all seemed like too much work so before getting seriously into it, I decided to work on the validator idea.

 

I created a couple feature branches which you can review/pull at the links below. This involves changes to both core-contracts and edgex-go.

 

https://github.com/tsconn23/go-mod-core-contracts/tree/validators

https://github.com/tsconn23/edgex-go/tree/validators

 

Rather than ProvisionWatcher I have used DeviceService in my example above because it already had an UnmarshalJSON() method in place, and also with the ProvisionWatcher I'd have to account for DeviceProfile as well. I just wanted to get something up and going to try and prove the concept, so the scope was more reduced by using DeviceService.

 

Here's a sample request I'm using to test:

POST http://localhost:48081/api/v1/deviceservice

{"_class":"org.edgexfoundry.domain.meta.DeviceService","adminState":"unlocked","operatingState":"disabled1","name":"ds_test1","description":"manage home variable speed motors","lastConnected":0,"lastReported":0,"labels":["hvac","variable speed motor"],"origin":1471806386919}

 

In the request body above there are two issues. First, the operatingState value is invalid. Second, the Addressable is missing. If you execute the above locally, you'll receive a 400 (Bad Request) with the error message "invalid OperatingState 'disabled1'".

If you fix the OperatingState value, then you'll still receive a 400 with a new error message "Addressable ID and Name are both blank". This is coming from the validator inside of the Addressable model within core-contracts. I noticed that we'll have the opportunity to slim our controller logic by moving the server-side validation we do today into the models. We'll also have consistent state validation (example: validation today is different depending on if you're adding a new Device Service versus updating an existing one).

 

The order of operations with regard to unmarshaling and validation looks like this

  • DeviceService
    • UnmarshalJSON
      • OperatingState (if provided in the request)
        • UnmarshalJSON
        • validate()
      • AdminState (if provided in the request)
        • UnmarshalJSON
        • validate()
    • validate()
      • OperatingState
        • validate()
      • AdminState
        • validate()

Herein is the only thing that gives me minor heartburn about the solution. The validate() method of both OperatingState and AdminState are called as part of their respective UnmarshalJSON implementations. It doesn't have to be but I thought it made sense to do so since it allows us to fail faster if we detect a problem.

 

Assuming those pass, then DeviceService calls its own validate() method which takes care of validating all child types. This is necessary in case a child type's key was omitted in the JSON. This means validation for any of the child types can occur twice -- a cheap operation to be sure, but redundant at times.

 

As an aside, you will see changes in the feature branches related to the elimination of models.Service from core-contracts. Nothing else was composed from this model other than the DeviceService and I felt that eliminating it would simplify the solution.

 

Comments are welcome. If this looks like a viable approach to folks, then there might be some bandwidth soon to undertake adding UnmarshalJSON to the models that do not have them, and implementing the necessary validate() handlers.

 

Trevor Conn
Technical Staff Engineer
Dell Technologies | IoT DellTech
Trevor_Conn@...
Round Rock, TX  USA

Re: [Edgex-tsc-core] Core Contract Model Validators

espy
 

On 4/4/19 3:39 PM, Malini Bhandaru wrote:

Thank you Trevor for the links! Shall check them out.

Thanks for the reply Malini!

 

Am I correct in assuming that the validator will be invoked at the recipient endpoint of a rest call, and only then?

That's what we were discussing during our call today. The original problem was that a received JSON model was missing a required field. Trevor's branches both make the call to Validate in the UnmarshallJSON methods of our models. So when any of our services handle incoming REST calls with JSON payloads, the service calls json.Unmarshall, which in turn causes the UnmarshallJSON method of the top-level model, and all of it's children to be called recursively. The problem is that since Validate is called by each model's UnmarshallJSON method, you can end up with Validate being called more than once for some of the embedded models. My suggestion to call validate *after* the json.Unmarshall call was an alternative to adding a "valid" flag to each model in order to reduce the duplicate calls.

All that said, there's nothing that would prevent our services (or unit tests) from using Validate() in other areas of the code...

Then we could as part of our microservices have a base class that does validation as a first step, kind of like form input validation at the server side.

Unfortunately Go doesn't provide traditional OO inheritance, meaning there's no concept of base classes. Your analogy is correct though, with this addition, our services would validate REST input (CBOR or JSON instead of HTML forms) before doing anything with the resulting models.

Reflection used recursively does makes perfect sense.

 

There was a comment during today’s core WG call about annotations and min required fields.

One thing that this does not address is “context” based required fields. For example a person may need to share only name and phone in “contact” but

Needs to share name, phone and address for a parcel delivery type flow.

Actually this proposal allows each model to decide what is required in order for the model to be considered valid. The original bug was that a provisionwatcher was received and it lacked an "OperatingState". With this approach, the OperatingState Validate() method would return an error as since one wasn't received in the JSON, the OperatingState isn't set to one of the two allowed values ("Enabled" or "Disabled").

What do people think about using gRPC for inter microservice communication? Not for Edinburg but going forward. gRPC has schemas, efficient.

We had a long discussion about the merits of gRPC during our original design discussions about binary data support. In the end, we decided to go with a new serialization format (CBOR), but keep REST as the default mechanism for cross-service communications.

Note - JSON also supports schemas, and we discussed the merits of using a schema validation vs. an Go interface-based approach, and it was felt the latter was a better solution.

Regards,
/tony

But Trevor may point out that schemas also constrain that we provide more as opposed to just a name.

 

Was driving, so I was unsuccessful in unmuting, but want to chime in on log levels. I do think the default should not be trace.

 

Sincerely

Malini

 

From: <EdgeX-GoLang@...> on behalf of "espy via Lists.Edgexfoundry.Org" <espy=canonical.com@...>
Reply-To: "espy@..." <espy@...>
Date: Thursday, April 4, 2019 at 12:13 PM
To: "Conn, Trevor (EMC)" <Trevor_Conn@...>, "EdgeX-GoLang@..." <EdgeX-GoLang@...>, "edgex-tsc-core@..." <edgex-tsc-core@...>
Cc: "EdgeX-GoLang@..." <EdgeX-GoLang@...>
Subject: Re: [Edgex-golang] [Edgex-tsc-core] Core Contract Model Validators

 

On 4/4/19 9:49 AM, Tony Espy wrote:

On 4/2/19 6:38 PM, Trevor.Conn@... wrote:

Another update on this –

 

This week in the #core channel on Slack Ian Johnson asked me why the approach below didn’t involve creating a Validator interface and using reflection to go through the members of a type, calling Validate() on those that implemented the interface. As those on the last Core WG call will remember, this was the original idea I proposed. So since Ian effectively +1’ed my original proposal I decided to implement it for a side by side comparison.

 

The changes only needed to be made to the core-contracts module and can be found here:

https://github.com/tsconn23/go-mod-core-contracts/tree/validator_intf

 

This is the first Go-based reflection code I’ve written, so there could be other ways to do this. However, I used the JSON Unmarshaler implementation from the standard library as an example.

I did a bit of research on this and there's been at least one formal golang proposal to enhance the JSON annotation language to cover missing fields:

https://github.com/golang/go/issues/16426

That said, although the proposal was shot down, the use of a Validator interface was discussed as an alternative.

I'm +1 on the overall approach (including use of reflection).

 

With the current structure of our requests, it’s hard to know where to invoke the validation from. In this example, the DeviceService is the top-level struct so I start the process here. However one could argue that an Addressable should be able to validate independently since we can add/update an Addressable on its own. That would mean the Addressable would end up being validated twice in this case.

I think always validating when unmarshalling an external request makes sense, and one could argue that validating if/when we construct a new instance of a model makes sense too. I'm not sure I follow the exact flow where we'd validate twice?

An alternate/simpler approach to that solves the duplicate validation calls problem is just to make validation a responsibility of the original caller (ie. the function/method that makes the first json.Unmarshall()) instead of always calling Validate() at the end of a model's UnmarshallJSON().

Regards,
/tony

Since we pass a deeply nested multii-purpose model around instead of Request types specific to the API endpoint that could be used for this kind of control, it probably is what it is unless we refactor.

 

Also, w/r/t recursion, I’d look to accomplish that through the Validate() functions at each node in the graph. So DeviceService calls Validate() on all its sub-types where applicable, then those sub-types would do the same followed by their sub-types and so on.

Seems reasonable

Regards,
/tony

 

Let me know your thoughts. We’ll probably be talking about this at this week’s meeting.

 

Trevor

 

 

From: Conn, Trevor
Sent: Saturday, March 30, 2019 4:27 PM
To: EdgeX-GoLang@...; edgex-tsc-core@...
Subject: Core Contract Model Validators

 

Hi all -- I wanted to follow up on this discussion we've been having in the Core Working Group about validating our contract models and where that responsibility lies. This past week we talked about two options.

 

1.) Using a JSON Schema-based validation

2.) Creating a validator implementation that can be called as part of the UnmarshalJSON operation found in some of our models.

 

I explored the JSON Schema solution a bit on Thursday afternoon with help from Intel, but I did not find it palatable. The schema has to be created and maintained then loaded at runtime from either URL, file or inline string. Child types have their schema, but that schema must be either be repeated in the schema of the parent type or referenced via a pointer. This all seemed like too much work so before getting seriously into it, I decided to work on the validator idea.

 

I created a couple feature branches which you can review/pull at the links below. This involves changes to both core-contracts and edgex-go.

 

https://github.com/tsconn23/go-mod-core-contracts/tree/validators

https://github.com/tsconn23/edgex-go/tree/validators

 

Rather than ProvisionWatcher I have used DeviceService in my example above because it already had an UnmarshalJSON() method in place, and also with the ProvisionWatcher I'd have to account for DeviceProfile as well. I just wanted to get something up and going to try and prove the concept, so the scope was more reduced by using DeviceService.

 

Here's a sample request I'm using to test:

POST http://localhost:48081/api/v1/deviceservice

{"_class":"org.edgexfoundry.domain.meta.DeviceService","adminState":"unlocked","operatingState":"disabled1","name":"ds_test1","description":"manage home variable speed motors","lastConnected":0,"lastReported":0,"labels":["hvac","variable speed motor"],"origin":1471806386919}

 

In the request body above there are two issues. First, the operatingState value is invalid. Second, the Addressable is missing. If you execute the above locally, you'll receive a 400 (Bad Request) with the error message "invalid OperatingState 'disabled1'".

If you fix the OperatingState value, then you'll still receive a 400 with a new error message "Addressable ID and Name are both blank". This is coming from the validator inside of the Addressable model within core-contracts. I noticed that we'll have the opportunity to slim our controller logic by moving the server-side validation we do today into the models. We'll also have consistent state validation (example: validation today is different depending on if you're adding a new Device Service versus updating an existing one).

 

The order of operations with regard to unmarshaling and validation looks like this

  • DeviceService
    • UnmarshalJSON
      • OperatingState (if provided in the request)
        • UnmarshalJSON
        • validate()
      • AdminState (if provided in the request)
        • UnmarshalJSON
        • validate()
    • validate()
      • OperatingState
        • validate()
      • AdminState
        • validate()

Herein is the only thing that gives me minor heartburn about the solution. The validate() method of both OperatingState and AdminState are called as part of their respective UnmarshalJSON implementations. It doesn't have to be but I thought it made sense to do so since it allows us to fail faster if we detect a problem.

 

Assuming those pass, then DeviceService calls its own validate() method which takes care of validating all child types. This is necessary in case a child type's key was omitted in the JSON. This means validation for any of the child types can occur twice -- a cheap operation to be sure, but redundant at times.

 

As an aside, you will see changes in the feature branches related to the elimination of models.Service from core-contracts. Nothing else was composed from this model other than the DeviceService and I felt that eliminating it would simplify the solution.

 

Comments are welcome. If this looks like a viable approach to folks, then there might be some bandwidth soon to undertake adding UnmarshalJSON to the models that do not have them, and implementing the necessary validate() handlers.

 

Trevor Conn
Technical Staff Engineer
Dell Technologies | IoT DellTech
Trevor_Conn@...
Round Rock, TX  USA

Re: [Edgex-tsc-core] Core Contract Model Validators

Trevor.Conn@...
 

Thanks for providing that summary Tony.


I've been working on incorporating the "is validated" discussion from this morning into the feature branches below. There's a problem in that we have types that are aliased to primitives that won't support it. For example,


type AdminState string

type OperatingState string


That means if provided in the request, types like this will always be checked twice. No way around it unless we rewrite the types which could then have other side effects.


As for a more complex type like Addressable, the solution is as follows. I add a non-exported member to the struct


type Addressable struct {
    ...

    isValidated bool
}


Then set the above to "true" when Validate() is called. I have extended the Validator interface to support an IsComplete() function returning a bool. The implementation looks like this.


func (a Addressable) IsComplete() bool {
    return a.isValidated
}


I've integrated this with the reflection code we went through this morning and it does work, but only for Addressable. Refactoring our request types as discussed this morning would get us out of needing to do this validation check at all, but I understand that's a lot of work.


Should I continue to pursue this with the understanding that we are only able to check a complex type for whether it's already been validated or not? If no one responds, I'm going to take that as a "Yes".


Latest changes are in this branch

https://github.com/tsconn23/go-mod-core-contracts/tree/validator_intf


Trevor Conn
Senior Principal Software Engineer
Dell Technologies | IoT DellTech
Trevor_Conn@...
Round Rock, TX  USA


From: EdgeX-GoLang@... <EdgeX-GoLang@...> on behalf of espy <espy@...>
Sent: Thursday, April 4, 2019 3:27 PM
To: Bhandaru, Malini (VMware); Conn, Trevor; EdgeX-GoLang@...; edgex-tsc-core@...
Subject: Re: [Edgex-golang] [Edgex-tsc-core] Core Contract Model Validators
 

[EXTERNAL EMAIL]

On 4/4/19 3:39 PM, Malini Bhandaru wrote:

Thank you Trevor for the links! Shall check them out.

Thanks for the reply Malini!

 

Am I correct in assuming that the validator will be invoked at the recipient endpoint of a rest call, and only then?

That's what we were discussing during our call today. The original problem was that a received JSON model was missing a required field. Trevor's branches both make the call to Validate in the UnmarshallJSON methods of our models. So when any of our services handle incoming REST calls with JSON payloads, the service calls json.Unmarshall, which in turn causes the UnmarshallJSON method of the top-level model, and all of it's children to be called recursively. The problem is that since Validate is called by each model's UnmarshallJSON method, you can end up with Validate being called more than once for some of the embedded models. My suggestion to call validate *after* the json.Unmarshall call was an alternative to adding a "valid" flag to each model in order to reduce the duplicate calls.

All that said, there's nothing that would prevent our services (or unit tests) from using Validate() in other areas of the code...

Then we could as part of our microservices have a base class that does validation as a first step, kind of like form input validation at the server side.

Unfortunately Go doesn't provide traditional OO inheritance, meaning there's no concept of base classes. Your analogy is correct though, with this addition, our services would validate REST input (CBOR or JSON instead of HTML forms) before doing anything with the resulting models.

Reflection used recursively does makes perfect sense.

 

There was a comment during today’s core WG call about annotations and min required fields.

One thing that this does not address is “context” based required fields. For example a person may need to share only name and phone in “contact” but

Needs to share name, phone and address for a parcel delivery type flow.

Actually this proposal allows each model to decide what is required in order for the model to be considered valid. The original bug was that a provisionwatcher was received and it lacked an "OperatingState". With this approach, the OperatingState Validate() method would return an error as since one wasn't received in the JSON, the OperatingState isn't set to one of the two allowed values ("Enabled" or "Disabled").

What do people think about using gRPC for inter microservice communication? Not for Edinburg but going forward. gRPC has schemas, efficient.

We had a long discussion about the merits of gRPC during our original design discussions about binary data support. In the end, we decided to go with a new serialization format (CBOR), but keep REST as the default mechanism for cross-service communications.

Note - JSON also supports schemas, and we discussed the merits of using a schema validation vs. an Go interface-based approach, and it was felt the latter was a better solution.

Regards,
/tony

But Trevor may point out that schemas also constrain that we provide more as opposed to just a name.

 

Was driving, so I was unsuccessful in unmuting, but want to chime in on log levels. I do think the default should not be trace.

 

Sincerely

Malini

 

From: <EdgeX-GoLang@...> on behalf of "espy via Lists.Edgexfoundry.Org" <espy=canonical.com@...>
Reply-To: "espy@..." <espy@...>
Date: Thursday, April 4, 2019 at 12:13 PM
To: "Conn, Trevor (EMC)" <Trevor_Conn@...>, "EdgeX-GoLang@..." <EdgeX-GoLang@...>, "edgex-tsc-core@..." <edgex-tsc-core@...>
Cc: "EdgeX-GoLang@..." <EdgeX-GoLang@...>
Subject: Re: [Edgex-golang] [Edgex-tsc-core] Core Contract Model Validators

 

On 4/4/19 9:49 AM, Tony Espy wrote:

On 4/2/19 6:38 PM, Trevor.Conn@... wrote:

Another update on this –

 

This week in the #core channel on Slack Ian Johnson asked me why the approach below didn’t involve creating a Validator interface and using reflection to go through the members of a type, calling Validate() on those that implemented the interface. As those on the last Core WG call will remember, this was the original idea I proposed. So since Ian effectively +1’ed my original proposal I decided to implement it for a side by side comparison.

 

The changes only needed to be made to the core-contracts module and can be found here:

https://github.com/tsconn23/go-mod-core-contracts/tree/validator_intf

 

This is the first Go-based reflection code I’ve written, so there could be other ways to do this. However, I used the JSON Unmarshaler implementation from the standard library as an example.

I did a bit of research on this and there's been at least one formal golang proposal to enhance the JSON annotation language to cover missing fields:

https://github.com/golang/go/issues/16426

That said, although the proposal was shot down, the use of a Validator interface was discussed as an alternative.

I'm +1 on the overall approach (including use of reflection).

 

With the current structure of our requests, it’s hard to know where to invoke the validation from. In this example, the DeviceService is the top-level struct so I start the process here. However one could argue that an Addressable should be able to validate independently since we can add/update an Addressable on its own. That would mean the Addressable would end up being validated twice in this case.

I think always validating when unmarshalling an external request makes sense, and one could argue that validating if/when we construct a new instance of a model makes sense too. I'm not sure I follow the exact flow where we'd validate twice?

An alternate/simpler approach to that solves the duplicate validation calls problem is just to make validation a responsibility of the original caller (ie. the function/method that makes the first json.Unmarshall()) instead of always calling Validate() at the end of a model's UnmarshallJSON().

Regards,
/tony

Since we pass a deeply nested multii-purpose model around instead of Request types specific to the API endpoint that could be used for this kind of control, it probably is what it is unless we refactor.

 

Also, w/r/t recursion, I’d look to accomplish that through the Validate() functions at each node in the graph. So DeviceService calls Validate() on all its sub-types where applicable, then those sub-types would do the same followed by their sub-types and so on.

Seems reasonable

Regards,
/tony

 

Let me know your thoughts. We’ll probably be talking about this at this week’s meeting.

 

Trevor

 

 

From: Conn, Trevor
Sent: Saturday, March 30, 2019 4:27 PM
To: EdgeX-GoLang@...; edgex-tsc-core@...
Subject: Core Contract Model Validators

 

Hi all -- I wanted to follow up on this discussion we've been having in the Core Working Group about validating our contract models and where that responsibility lies. This past week we talked about two options.

 

1.) Using a JSON Schema-based validation

2.) Creating a validator implementation that can be called as part of the UnmarshalJSON operation found in some of our models.

 

I explored the JSON Schema solution a bit on Thursday afternoon with help from Intel, but I did not find it palatable. The schema has to be created and maintained then loaded at runtime from either URL, file or inline string. Child types have their schema, but that schema must be either be repeated in the schema of the parent type or referenced via a pointer. This all seemed like too much work so before getting seriously into it, I decided to work on the validator idea.

 

I created a couple feature branches which you can review/pull at the links below. This involves changes to both core-contracts and edgex-go.

 

https://github.com/tsconn23/go-mod-core-contracts/tree/validators

https://github.com/tsconn23/edgex-go/tree/validators

 

Rather than ProvisionWatcher I have used DeviceService in my example above because it already had an UnmarshalJSON() method in place, and also with the ProvisionWatcher I'd have to account for DeviceProfile as well. I just wanted to get something up and going to try and prove the concept, so the scope was more reduced by using DeviceService.

 

Here's a sample request I'm using to test:

POST http://localhost:48081/api/v1/deviceservice

{"_class":"org.edgexfoundry.domain.meta.DeviceService","adminState":"unlocked","operatingState":"disabled1","name":"ds_test1","description":"manage home variable speed motors","lastConnected":0,"lastReported":0,"labels":["hvac","variable speed motor"],"origin":1471806386919}

 

In the request body above there are two issues. First, the operatingState value is invalid. Second, the Addressable is missing. If you execute the above locally, you'll receive a 400 (Bad Request) with the error message "invalid OperatingState 'disabled1'".

If you fix the OperatingState value, then you'll still receive a 400 with a new error message "Addressable ID and Name are both blank". This is coming from the validator inside of the Addressable model within core-contracts. I noticed that we'll have the opportunity to slim our controller logic by moving the server-side validation we do today into the models. We'll also have consistent state validation (example: validation today is different depending on if you're adding a new Device Service versus updating an existing one).

 

The order of operations with regard to unmarshaling and validation looks like this

  • DeviceService
    • UnmarshalJSON
      • OperatingState (if provided in the request)
        • UnmarshalJSON
        • validate()
      • AdminState (if provided in the request)
        • UnmarshalJSON
        • validate()
    • validate()
      • OperatingState
        • validate()
      • AdminState
        • validate()

Herein is the only thing that gives me minor heartburn about the solution. The validate() method of both OperatingState and AdminState are called as part of their respective UnmarshalJSON implementations. It doesn't have to be but I thought it made sense to do so since it allows us to fail faster if we detect a problem.

 

Assuming those pass, then DeviceService calls its own validate() method which takes care of validating all child types. This is necessary in case a child type's key was omitted in the JSON. This means validation for any of the child types can occur twice -- a cheap operation to be sure, but redundant at times.

 

As an aside, you will see changes in the feature branches related to the elimination of models.Service from core-contracts. Nothing else was composed from this model other than the DeviceService and I felt that eliminating it would simplify the solution.

 

Comments are welcome. If this looks like a viable approach to folks, then there might be some bandwidth soon to undertake adding UnmarshalJSON to the models that do not have them, and implementing the necessary validate() handlers.

 

Trevor Conn
Technical Staff Engineer
Dell Technologies | IoT DellTech
Trevor_Conn@...
Round Rock, TX  USA

Re: [Edgex-tsc-core] Core Contract Model Validators

Ian Johnson
 

+1 from me to continue on this approach

Feel free to also add me as a reviewer when you're ready - I've written some go reflection code for a few different projects.


On Thu, Apr 4, 2019, 17:55 <Trevor.Conn@...> wrote:

Thanks for providing that summary Tony.


I've been working on incorporating the "is validated" discussion from this morning into the feature branches below. There's a problem in that we have types that are aliased to primitives that won't support it. For example,


type AdminState string

type OperatingState string


That means if provided in the request, types like this will always be checked twice. No way around it unless we rewrite the types which could then have other side effects.


As for a more complex type like Addressable, the solution is as follows. I add a non-exported member to the struct


type Addressable struct {
    ...

    isValidated bool
}


Then set the above to "true" when Validate() is called. I have extended the Validator interface to support an IsComplete() function returning a bool. The implementation looks like this.


func (a Addressable) IsComplete() bool {
    return a.isValidated
}


I've integrated this with the reflection code we went through this morning and it does work, but only for Addressable. Refactoring our request types as discussed this morning would get us out of needing to do this validation check at all, but I understand that's a lot of work.


Should I continue to pursue this with the understanding that we are only able to check a complex type for whether it's already been validated or not? If no one responds, I'm going to take that as a "Yes".


Latest changes are in this branch

https://github.com/tsconn23/go-mod-core-contracts/tree/validator_intf


Trevor Conn
Senior Principal Software Engineer
Dell Technologies | IoT DellTech
Trevor_Conn@...
Round Rock, TX  USA

From: EdgeX-GoLang@... <EdgeX-GoLang@...> on behalf of espy <espy@...>
Sent: Thursday, April 4, 2019 3:27 PM
To: Bhandaru, Malini (VMware); Conn, Trevor; EdgeX-GoLang@...; edgex-tsc-core@...
Subject: Re: [Edgex-golang] [Edgex-tsc-core] Core Contract Model Validators
 

[EXTERNAL EMAIL]

On 4/4/19 3:39 PM, Malini Bhandaru wrote:

Thank you Trevor for the links! Shall check them out.

Thanks for the reply Malini!

 

Am I correct in assuming that the validator will be invoked at the recipient endpoint of a rest call, and only then?

That's what we were discussing during our call today. The original problem was that a received JSON model was missing a required field. Trevor's branches both make the call to Validate in the UnmarshallJSON methods of our models. So when any of our services handle incoming REST calls with JSON payloads, the service calls json.Unmarshall, which in turn causes the UnmarshallJSON method of the top-level model, and all of it's children to be called recursively. The problem is that since Validate is called by each model's UnmarshallJSON method, you can end up with Validate being called more than once for some of the embedded models. My suggestion to call validate *after* the json.Unmarshall call was an alternative to adding a "valid" flag to each model in order to reduce the duplicate calls.

All that said, there's nothing that would prevent our services (or unit tests) from using Validate() in other areas of the code...

Then we could as part of our microservices have a base class that does validation as a first step, kind of like form input validation at the server side.

Unfortunately Go doesn't provide traditional OO inheritance, meaning there's no concept of base classes. Your analogy is correct though, with this addition, our services would validate REST input (CBOR or JSON instead of HTML forms) before doing anything with the resulting models.

Reflection used recursively does makes perfect sense.

 

There was a comment during today’s core WG call about annotations and min required fields.

One thing that this does not address is “context” based required fields. For example a person may need to share only name and phone in “contact” but

Needs to share name, phone and address for a parcel delivery type flow.

Actually this proposal allows each model to decide what is required in order for the model to be considered valid. The original bug was that a provisionwatcher was received and it lacked an "OperatingState". With this approach, the OperatingState Validate() method would return an error as since one wasn't received in the JSON, the OperatingState isn't set to one of the two allowed values ("Enabled" or "Disabled").

What do people think about using gRPC for inter microservice communication? Not for Edinburg but going forward. gRPC has schemas, efficient.

We had a long discussion about the merits of gRPC during our original design discussions about binary data support. In the end, we decided to go with a new serialization format (CBOR), but keep REST as the default mechanism for cross-service communications.

Note - JSON also supports schemas, and we discussed the merits of using a schema validation vs. an Go interface-based approach, and it was felt the latter was a better solution.

Regards,
/tony

But Trevor may point out that schemas also constrain that we provide more as opposed to just a name.

 

Was driving, so I was unsuccessful in unmuting, but want to chime in on log levels. I do think the default should not be trace.

 

Sincerely

Malini

 

From: <EdgeX-GoLang@...> on behalf of "espy via Lists.Edgexfoundry.Org" <espy=canonical.com@...>
Reply-To: "espy@..." <espy@...>
Date: Thursday, April 4, 2019 at 12:13 PM
To: "Conn, Trevor (EMC)" <Trevor_Conn@...>, "EdgeX-GoLang@..." <EdgeX-GoLang@...>, "edgex-tsc-core@..." <edgex-tsc-core@...>
Cc: "EdgeX-GoLang@..." <EdgeX-GoLang@...>
Subject: Re: [Edgex-golang] [Edgex-tsc-core] Core Contract Model Validators

 

On 4/4/19 9:49 AM, Tony Espy wrote:

On 4/2/19 6:38 PM, Trevor.Conn@... wrote:

Another update on this –

 

This week in the #core channel on Slack Ian Johnson asked me why the approach below didn’t involve creating a Validator interface and using reflection to go through the members of a type, calling Validate() on those that implemented the interface. As those on the last Core WG call will remember, this was the original idea I proposed. So since Ian effectively +1’ed my original proposal I decided to implement it for a side by side comparison.

 

The changes only needed to be made to the core-contracts module and can be found here:

https://github.com/tsconn23/go-mod-core-contracts/tree/validator_intf

 

This is the first Go-based reflection code I’ve written, so there could be other ways to do this. However, I used the JSON Unmarshaler implementation from the standard library as an example.

I did a bit of research on this and there's been at least one formal golang proposal to enhance the JSON annotation language to cover missing fields:

https://github.com/golang/go/issues/16426

That said, although the proposal was shot down, the use of a Validator interface was discussed as an alternative.

I'm +1 on the overall approach (including use of reflection).

 

With the current structure of our requests, it’s hard to know where to invoke the validation from. In this example, the DeviceService is the top-level struct so I start the process here. However one could argue that an Addressable should be able to validate independently since we can add/update an Addressable on its own. That would mean the Addressable would end up being validated twice in this case.

I think always validating when unmarshalling an external request makes sense, and one could argue that validating if/when we construct a new instance of a model makes sense too. I'm not sure I follow the exact flow where we'd validate twice?

An alternate/simpler approach to that solves the duplicate validation calls problem is just to make validation a responsibility of the original caller (ie. the function/method that makes the first json.Unmarshall()) instead of always calling Validate() at the end of a model's UnmarshallJSON().

Regards,
/tony

Since we pass a deeply nested multii-purpose model around instead of Request types specific to the API endpoint that could be used for this kind of control, it probably is what it is unless we refactor.

 

Also, w/r/t recursion, I’d look to accomplish that through the Validate() functions at each node in the graph. So DeviceService calls Validate() on all its sub-types where applicable, then those sub-types would do the same followed by their sub-types and so on.

Seems reasonable

Regards,
/tony

 

Let me know your thoughts. We’ll probably be talking about this at this week’s meeting.

 

Trevor

 

 

From: Conn, Trevor
Sent: Saturday, March 30, 2019 4:27 PM
To: EdgeX-GoLang@...; edgex-tsc-core@...
Subject: Core Contract Model Validators

 

Hi all -- I wanted to follow up on this discussion we've been having in the Core Working Group about validating our contract models and where that responsibility lies. This past week we talked about two options.

 

1.) Using a JSON Schema-based validation

2.) Creating a validator implementation that can be called as part of the UnmarshalJSON operation found in some of our models.

 

I explored the JSON Schema solution a bit on Thursday afternoon with help from Intel, but I did not find it palatable. The schema has to be created and maintained then loaded at runtime from either URL, file or inline string. Child types have their schema, but that schema must be either be repeated in the schema of the parent type or referenced via a pointer. This all seemed like too much work so before getting seriously into it, I decided to work on the validator idea.

 

I created a couple feature branches which you can review/pull at the links below. This involves changes to both core-contracts and edgex-go.

 

https://github.com/tsconn23/go-mod-core-contracts/tree/validators

https://github.com/tsconn23/edgex-go/tree/validators

 

Rather than ProvisionWatcher I have used DeviceService in my example above because it already had an UnmarshalJSON() method in place, and also with the ProvisionWatcher I'd have to account for DeviceProfile as well. I just wanted to get something up and going to try and prove the concept, so the scope was more reduced by using DeviceService.

 

Here's a sample request I'm using to test:

POST http://localhost:48081/api/v1/deviceservice

{"_class":"org.edgexfoundry.domain.meta.DeviceService","adminState":"unlocked","operatingState":"disabled1","name":"ds_test1","description":"manage home variable speed motors","lastConnected":0,"lastReported":0,"labels":["hvac","variable speed motor"],"origin":1471806386919}

 

In the request body above there are two issues. First, the operatingState value is invalid. Second, the Addressable is missing. If you execute the above locally, you'll receive a 400 (Bad Request) with the error message "invalid OperatingState 'disabled1'".

If you fix the OperatingState value, then you'll still receive a 400 with a new error message "Addressable ID and Name are both blank". This is coming from the validator inside of the Addressable model within core-contracts. I noticed that we'll have the opportunity to slim our controller logic by moving the server-side validation we do today into the models. We'll also have consistent state validation (example: validation today is different depending on if you're adding a new Device Service versus updating an existing one).

 

The order of operations with regard to unmarshaling and validation looks like this

  • DeviceService
    • UnmarshalJSON
      • OperatingState (if provided in the request)
        • UnmarshalJSON
        • validate()
      • AdminState (if provided in the request)
        • UnmarshalJSON
        • validate()
    • validate()
      • OperatingState
        • validate()
      • AdminState
        • validate()

Herein is the only thing that gives me minor heartburn about the solution. The validate() method of both OperatingState and AdminState are called as part of their respective UnmarshalJSON implementations. It doesn't have to be but I thought it made sense to do so since it allows us to fail faster if we detect a problem.

 

Assuming those pass, then DeviceService calls its own validate() method which takes care of validating all child types. This is necessary in case a child type's key was omitted in the JSON. This means validation for any of the child types can occur twice -- a cheap operation to be sure, but redundant at times.

 

As an aside, you will see changes in the feature branches related to the elimination of models.Service from core-contracts. Nothing else was composed from this model other than the DeviceService and I felt that eliminating it would simplify the solution.

 

Comments are welcome. If this looks like a viable approach to folks, then there might be some bandwidth soon to undertake adding UnmarshalJSON to the models that do not have them, and implementing the necessary validate() handlers.

 

Trevor Conn
Technical Staff Engineer
Dell Technologies | IoT DellTech
Trevor_Conn@...
Round Rock, TX  USA

Re: [Edgex-tsc-core] Core Contract Model Validators

Goodell, Leonard
 

+1 from me also.

 

From: EdgeX-GoLang@... <EdgeX-GoLang@...> On Behalf Of Ian Johnson
Sent: Thursday, April 04, 2019 5:39 PM
To: Trevor.Conn@...
Cc: Tony Espy <espy@...>; mbhandaru@...; EdgeX-GoLang@...; EdgeX-TSC-Core@...
Subject: Re: [Edgex-golang] [Edgex-tsc-core] Core Contract Model Validators

 

+1 from me to continue on this approach

 

Feel free to also add me as a reviewer when you're ready - I've written some go reflection code for a few different projects.

On Thu, Apr 4, 2019, 17:55 <Trevor.Conn@...> wrote:

Thanks for providing that summary Tony.

 

I've been working on incorporating the "is validated" discussion from this morning into the feature branches below. There's a problem in that we have types that are aliased to primitives that won't support it. For example,

 

type AdminState string

type OperatingState string

 

That means if provided in the request, types like this will always be checked twice. No way around it unless we rewrite the types which could then have other side effects.

 

As for a more complex type like Addressable, the solution is as follows. I add a non-exported member to the struct

 

type Addressable struct {
    ...

    isValidated bool
}

 

Then set the above to "true" when Validate() is called. I have extended the Validator interface to support an IsComplete() function returning a bool. The implementation looks like this.

 

func (a Addressable) IsComplete() bool {
    return a.isValidated
}

 

I've integrated this with the reflection code we went through this morning and it does work, but only for Addressable. Refactoring our request types as discussed this morning would get us out of needing to do this validation check at all, but I understand that's a lot of work.

 

Should I continue to pursue this with the understanding that we are only able to check a complex type for whether it's already been validated or not? If no one responds, I'm going to take that as a "Yes".

 

Latest changes are in this branch

https://github.com/tsconn23/go-mod-core-contracts/tree/validator_intf

 

Trevor Conn
Senior Principal Software Engineer
Dell Technologies | IoT DellTech
Trevor_Conn@...
Round Rock, TX  USA


From: EdgeX-GoLang@... <EdgeX-GoLang@...> on behalf of espy <espy@...>
Sent: Thursday, April 4, 2019 3:27 PM
To: Bhandaru, Malini (VMware); Conn, Trevor; EdgeX-GoLang@...; edgex-tsc-core@...
Subject: Re: [Edgex-golang] [Edgex-tsc-core] Core Contract Model Validators

 

[EXTERNAL EMAIL]

On 4/4/19 3:39 PM, Malini Bhandaru wrote:

Thank you Trevor for the links! Shall check them out.

Thanks for the reply Malini!

 

Am I correct in assuming that the validator will be invoked at the recipient endpoint of a rest call, and only then?

That's what we were discussing during our call today. The original problem was that a received JSON model was missing a required field. Trevor's branches both make the call to Validate in the UnmarshallJSON methods of our models. So when any of our services handle incoming REST calls with JSON payloads, the service calls json.Unmarshall, which in turn causes the UnmarshallJSON method of the top-level model, and all of it's children to be called recursively. The problem is that since Validate is called by each model's UnmarshallJSON method, you can end up with Validate being called more than once for some of the embedded models. My suggestion to call validate *after* the json.Unmarshall call was an alternative to adding a "valid" flag to each model in order to reduce the duplicate calls.

All that said, there's nothing that would prevent our services (or unit tests) from using Validate() in other areas of the code...

Then we could as part of our microservices have a base class that does validation as a first step, kind of like form input validation at the server side.

Unfortunately Go doesn't provide traditional OO inheritance, meaning there's no concept of base classes. Your analogy is correct though, with this addition, our services would validate REST input (CBOR or JSON instead of HTML forms) before doing anything with the resulting models.

Reflection used recursively does makes perfect sense.

 

There was a comment during today’s core WG call about annotations and min required fields.

One thing that this does not address is “context” based required fields. For example a person may need to share only name and phone in “contact” but

Needs to share name, phone and address for a parcel delivery type flow.

Actually this proposal allows each model to decide what is required in order for the model to be considered valid. The original bug was that a provisionwatcher was received and it lacked an "OperatingState". With this approach, the OperatingState Validate() method would return an error as since one wasn't received in the JSON, the OperatingState isn't set to one of the two allowed values ("Enabled" or "Disabled").

What do people think about using gRPC for inter microservice communication? Not for Edinburg but going forward. gRPC has schemas, efficient.

We had a long discussion about the merits of gRPC during our original design discussions about binary data support. In the end, we decided to go with a new serialization format (CBOR), but keep REST as the default mechanism for cross-service communications.

Note - JSON also supports schemas, and we discussed the merits of using a schema validation vs. an Go interface-based approach, and it was felt the latter was a better solution.

Regards,
/tony

But Trevor may point out that schemas also constrain that we provide more as opposed to just a name.

 

Was driving, so I was unsuccessful in unmuting, but want to chime in on log levels. I do think the default should not be trace.

 

Sincerely

Malini

 

From: <EdgeX-GoLang@...> on behalf of "espy via Lists.Edgexfoundry.Org" <espy=canonical.com@...>
Reply-To: "espy@..." <espy@...>
Date: Thursday, April 4, 2019 at 12:13 PM
To: "Conn, Trevor (EMC)" <Trevor_Conn@...>, "EdgeX-GoLang@..." <EdgeX-GoLang@...>, "edgex-tsc-core@..." <edgex-tsc-core@...>
Cc: "EdgeX-GoLang@..." <EdgeX-GoLang@...>
Subject: Re: [Edgex-golang] [Edgex-tsc-core] Core Contract Model Validators

 

On 4/4/19 9:49 AM, Tony Espy wrote:

On 4/2/19 6:38 PM, Trevor.Conn@... wrote:

Another update on this –

 

This week in the #core channel on Slack Ian Johnson asked me why the approach below didn’t involve creating a Validator interface and using reflection to go through the members of a type, calling Validate() on those that implemented the interface. As those on the last Core WG call will remember, this was the original idea I proposed. So since Ian effectively +1’ed my original proposal I decided to implement it for a side by side comparison.

 

The changes only needed to be made to the core-contracts module and can be found here:

https://github.com/tsconn23/go-mod-core-contracts/tree/validator_intf

 

This is the first Go-based reflection code I’ve written, so there could be other ways to do this. However, I used the JSON Unmarshaler implementation from the standard library as an example.

I did a bit of research on this and there's been at least one formal golang proposal to enhance the JSON annotation language to cover missing fields:

https://github.com/golang/go/issues/16426

That said, although the proposal was shot down, the use of a Validator interface was discussed as an alternative.

I'm +1 on the overall approach (including use of reflection).

 

With the current structure of our requests, it’s hard to know where to invoke the validation from. In this example, the DeviceService is the top-level struct so I start the process here. However one could argue that an Addressable should be able to validate independently since we can add/update an Addressable on its own. That would mean the Addressable would end up being validated twice in this case.

I think always validating when unmarshalling an external request makes sense, and one could argue that validating if/when we construct a new instance of a model makes sense too. I'm not sure I follow the exact flow where we'd validate twice?

An alternate/simpler approach to that solves the duplicate validation calls problem is just to make validation a responsibility of the original caller (ie. the function/method that makes the first json.Unmarshall()) instead of always calling Validate() at the end of a model's UnmarshallJSON().

Regards,
/tony

Since we pass a deeply nested multii-purpose model around instead of Request types specific to the API endpoint that could be used for this kind of control, it probably is what it is unless we refactor.

 

Also, w/r/t recursion, I’d look to accomplish that through the Validate() functions at each node in the graph. So DeviceService calls Validate() on all its sub-types where applicable, then those sub-types would do the same followed by their sub-types and so on.

Seems reasonable

Regards,
/tony

 

Let me know your thoughts. We’ll probably be talking about this at this week’s meeting.

 

Trevor

 

 

From: Conn, Trevor
Sent: Saturday, March 30, 2019 4:27 PM
To: EdgeX-GoLang@...; edgex-tsc-core@...
Subject: Core Contract Model Validators

 

Hi all -- I wanted to follow up on this discussion we've been having in the Core Working Group about validating our contract models and where that responsibility lies. This past week we talked about two options.

 

1.) Using a JSON Schema-based validation

2.) Creating a validator implementation that can be called as part of the UnmarshalJSON operation found in some of our models.

 

I explored the JSON Schema solution a bit on Thursday afternoon with help from Intel, but I did not find it palatable. The schema has to be created and maintained then loaded at runtime from either URL, file or inline string. Child types have their schema, but that schema must be either be repeated in the schema of the parent type or referenced via a pointer. This all seemed like too much work so before getting seriously into it, I decided to work on the validator idea.

 

I created a couple feature branches which you can review/pull at the links below. This involves changes to both core-contracts and edgex-go.

 

https://github.com/tsconn23/go-mod-core-contracts/tree/validators

https://github.com/tsconn23/edgex-go/tree/validators

 

Rather than ProvisionWatcher I have used DeviceService in my example above because it already had an UnmarshalJSON() method in place, and also with the ProvisionWatcher I'd have to account for DeviceProfile as well. I just wanted to get something up and going to try and prove the concept, so the scope was more reduced by using DeviceService.

 

Here's a sample request I'm using to test:

POST http://localhost:48081/api/v1/deviceservice

{"_class":"org.edgexfoundry.domain.meta.DeviceService","adminState":"unlocked","operatingState":"disabled1","name":"ds_test1","description":"manage home variable speed motors","lastConnected":0,"lastReported":0,"labels":["hvac","variable speed motor"],"origin":1471806386919}

 

In the request body above there are two issues. First, the operatingState value is invalid. Second, the Addressable is missing. If you execute the above locally, you'll receive a 400 (Bad Request) with the error message "invalid OperatingState 'disabled1'".

If you fix the OperatingState value, then you'll still receive a 400 with a new error message "Addressable ID and Name are both blank". This is coming from the validator inside of the Addressable model within core-contracts. I noticed that we'll have the opportunity to slim our controller logic by moving the server-side validation we do today into the models. We'll also have consistent state validation (example: validation today is different depending on if you're adding a new Device Service versus updating an existing one).

 

The order of operations with regard to unmarshaling and validation looks like this

  • DeviceService
    • UnmarshalJSON
      • OperatingState (if provided in the request)
        • UnmarshalJSON
        • validate()
      • AdminState (if provided in the request)
        • UnmarshalJSON
        • validate()
    • validate()
      • OperatingState
        • validate()
      • AdminState
        • validate()

Herein is the only thing that gives me minor heartburn about the solution. The validate() method of both OperatingState and AdminState are called as part of their respective UnmarshalJSON implementations. It doesn't have to be but I thought it made sense to do so since it allows us to fail faster if we detect a problem.

 

Assuming those pass, then DeviceService calls its own validate() method which takes care of validating all child types. This is necessary in case a child type's key was omitted in the JSON. This means validation for any of the child types can occur twice -- a cheap operation to be sure, but redundant at times.

 

As an aside, you will see changes in the feature branches related to the elimination of models.Service from core-contracts. Nothing else was composed from this model other than the DeviceService and I felt that eliminating it would simplify the solution.

 

Comments are welcome. If this looks like a viable approach to folks, then there might be some bandwidth soon to undertake adding UnmarshalJSON to the models that do not have them, and implementing the necessary validate() handlers.

 

Trevor Conn
Technical Staff Engineer
Dell Technologies | IoT DellTech
Trevor_Conn@...
Round Rock, TX  USA

Re: [Edgex-tsc-core] Core Contract Model Validators

espy
 

On 4/4/19 6:55 PM, Trevor.Conn@... wrote:

Thanks for providing that summary Tony.


I've been working on incorporating the "is validated" discussion from this morning into the feature branches below. There's a problem in that we have types that are aliased to primitives that won't support it. For example,


type AdminState string

type OperatingState string


That means if provided in the request, types like this will always be checked twice. No way around it unless we rewrite the types which could then have other side effects.

I *think* what you're saying here is that if we changed the UnmarshallJSON() functions to call <model>.Validate() in adminstate.go and operatingstate.go, then we have the same problem as with Addressable that you solve below...

As for a more complex type like Addressable, the solution is as follows. I add a non-exported member to the struct


type Addressable struct {
    ...

    isValidated bool
}


Then set the above to "true" when Validate() is called. I have extended the Validator interface to support an IsComplete() function returning a bool. The implementation looks like this.


func (a Addressable) IsComplete() bool {
    return a.isValidated
}


I've integrated this with the reflection code we went through this morning and it does work, but only for Addressable. Refactoring our request types as discussed this morning would get us out of needing to do this validation check at all, but I understand that's a lot of work.

I'm not sure what you mean by the last sentence?

Should I continue to pursue this with the understanding that we are only able to check a complex type for whether it's already been validated or not? If no one responds, I'm going to take that as a "Yes".


Latest changes are in this branch

https://github.com/tsconn23/go-mod-core-contracts/tree/validator_intf


One downside to this approach is that once the isValidated flag is set, it can become stale if the instance is modified after being unmarshaled. If you wanted for instance to re-check the instance before saving to a persistent store, it wouldn't work. Now maybe we say our models are immutable once received, but I know at least one case (Add) where a model is modified after being unmarshaled. Perhaps there are more cases of this?


I'm pretty sure I mentioned another way to solve this last Thu, which is to make validation the responsible of the code that triggers the original unmarshal call vs. having each model's UnmarshallJSON() method be responsible for it. This gets rid of the duplicate Validate calls and is a bit simpler than adding a isValid flag to each model.  Checkout my code (based on your branch):


https://github.com/tonyespy/go-mod-core-contracts/commits/val-intf-awe


Regards,
/tony

Trevor Conn
Senior Principal Software Engineer
Dell Technologies | IoT DellTech
Trevor_Conn@...
Round Rock, TX  USA

From: EdgeX-GoLang@... <EdgeX-GoLang@...> on behalf of espy <espy@...>
Sent: Thursday, April 4, 2019 3:27 PM
To: Bhandaru, Malini (VMware); Conn, Trevor; EdgeX-GoLang@...; edgex-tsc-core@...
Subject: Re: [Edgex-golang] [Edgex-tsc-core] Core Contract Model Validators
 

[EXTERNAL EMAIL]

On 4/4/19 3:39 PM, Malini Bhandaru wrote:

Thank you Trevor for the links! Shall check them out.

Thanks for the reply Malini!

 

Am I correct in assuming that the validator will be invoked at the recipient endpoint of a rest call, and only then?

That's what we were discussing during our call today. The original problem was that a received JSON model was missing a required field. Trevor's branches both make the call to Validate in the UnmarshallJSON methods of our models. So when any of our services handle incoming REST calls with JSON payloads, the service calls json.Unmarshall, which in turn causes the UnmarshallJSON method of the top-level model, and all of it's children to be called recursively. The problem is that since Validate is called by each model's UnmarshallJSON method, you can end up with Validate being called more than once for some of the embedded models. My suggestion to call validate *after* the json.Unmarshall call was an alternative to adding a "valid" flag to each model in order to reduce the duplicate calls.

All that said, there's nothing that would prevent our services (or unit tests) from using Validate() in other areas of the code...

Then we could as part of our microservices have a base class that does validation as a first step, kind of like form input validation at the server side.

Unfortunately Go doesn't provide traditional OO inheritance, meaning there's no concept of base classes. Your analogy is correct though, with this addition, our services would validate REST input (CBOR or JSON instead of HTML forms) before doing anything with the resulting models.

Reflection used recursively does makes perfect sense.

 

There was a comment during today’s core WG call about annotations and min required fields.

One thing that this does not address is “context” based required fields. For example a person may need to share only name and phone in “contact” but

Needs to share name, phone and address for a parcel delivery type flow.

Actually this proposal allows each model to decide what is required in order for the model to be considered valid. The original bug was that a provisionwatcher was received and it lacked an "OperatingState". With this approach, the OperatingState Validate() method would return an error as since one wasn't received in the JSON, the OperatingState isn't set to one of the two allowed values ("Enabled" or "Disabled").

What do people think about using gRPC for inter microservice communication? Not for Edinburg but going forward. gRPC has schemas, efficient.

We had a long discussion about the merits of gRPC during our original design discussions about binary data support. In the end, we decided to go with a new serialization format (CBOR), but keep REST as the default mechanism for cross-service communications.

Note - JSON also supports schemas, and we discussed the merits of using a schema validation vs. an Go interface-based approach, and it was felt the latter was a better solution.

Regards,
/tony

But Trevor may point out that schemas also constrain that we provide more as opposed to just a name.

 

Was driving, so I was unsuccessful in unmuting, but want to chime in on log levels. I do think the default should not be trace.

 

Sincerely

Malini

 

From: <EdgeX-GoLang@...> on behalf of "espy via Lists.Edgexfoundry.Org" <espy=canonical.com@...>
Reply-To: "espy@..." <espy@...>
Date: Thursday, April 4, 2019 at 12:13 PM
To: "Conn, Trevor (EMC)" <Trevor_Conn@...>, "EdgeX-GoLang@..." <EdgeX-GoLang@...>, "edgex-tsc-core@..." <edgex-tsc-core@...>
Cc: "EdgeX-GoLang@..." <EdgeX-GoLang@...>
Subject: Re: [Edgex-golang] [Edgex-tsc-core] Core Contract Model Validators

 

On 4/4/19 9:49 AM, Tony Espy wrote:

On 4/2/19 6:38 PM, Trevor.Conn@... wrote:

Another update on this –

 

This week in the #core channel on Slack Ian Johnson asked me why the approach below didn’t involve creating a Validator interface and using reflection to go through the members of a type, calling Validate() on those that implemented the interface. As those on the last Core WG call will remember, this was the original idea I proposed. So since Ian effectively +1’ed my original proposal I decided to implement it for a side by side comparison.

 

The changes only needed to be made to the core-contracts module and can be found here:

https://github.com/tsconn23/go-mod-core-contracts/tree/validator_intf

 

This is the first Go-based reflection code I’ve written, so there could be other ways to do this. However, I used the JSON Unmarshaler implementation from the standard library as an example.

I did a bit of research on this and there's been at least one formal golang proposal to enhance the JSON annotation language to cover missing fields:

https://github.com/golang/go/issues/16426

That said, although the proposal was shot down, the use of a Validator interface was discussed as an alternative.

I'm +1 on the overall approach (including use of reflection).

 

With the current structure of our requests, it’s hard to know where to invoke the validation from. In this example, the DeviceService is the top-level struct so I start the process here. However one could argue that an Addressable should be able to validate independently since we can add/update an Addressable on its own. That would mean the Addressable would end up being validated twice in this case.

I think always validating when unmarshalling an external request makes sense, and one could argue that validating if/when we construct a new instance of a model makes sense too. I'm not sure I follow the exact flow where we'd validate twice?

An alternate/simpler approach to that solves the duplicate validation calls problem is just to make validation a responsibility of the original caller (ie. the function/method that makes the first json.Unmarshall()) instead of always calling Validate() at the end of a model's UnmarshallJSON().

Regards,
/tony

Since we pass a deeply nested multii-purpose model around instead of Request types specific to the API endpoint that could be used for this kind of control, it probably is what it is unless we refactor.

 

Also, w/r/t recursion, I’d look to accomplish that through the Validate() functions at each node in the graph. So DeviceService calls Validate() on all its sub-types where applicable, then those sub-types would do the same followed by their sub-types and so on.

Seems reasonable

Regards,
/tony

 

Let me know your thoughts. We’ll probably be talking about this at this week’s meeting.

 

Trevor

 

 

From: Conn, Trevor
Sent: Saturday, March 30, 2019 4:27 PM
To: EdgeX-GoLang@...; edgex-tsc-core@...
Subject: Core Contract Model Validators

 

Hi all -- I wanted to follow up on this discussion we've been having in the Core Working Group about validating our contract models and where that responsibility lies. This past week we talked about two options.

 

1.) Using a JSON Schema-based validation

2.) Creating a validator implementation that can be called as part of the UnmarshalJSON operation found in some of our models.

 

I explored the JSON Schema solution a bit on Thursday afternoon with help from Intel, but I did not find it palatable. The schema has to be created and maintained then loaded at runtime from either URL, file or inline string. Child types have their schema, but that schema must be either be repeated in the schema of the parent type or referenced via a pointer. This all seemed like too much work so before getting seriously into it, I decided to work on the validator idea.

 

I created a couple feature branches which you can review/pull at the links below. This involves changes to both core-contracts and edgex-go.

 

https://github.com/tsconn23/go-mod-core-contracts/tree/validators

https://github.com/tsconn23/edgex-go/tree/validators

 

Rather than ProvisionWatcher I have used DeviceService in my example above because it already had an UnmarshalJSON() method in place, and also with the ProvisionWatcher I'd have to account for DeviceProfile as well. I just wanted to get something up and going to try and prove the concept, so the scope was more reduced by using DeviceService.

 

Here's a sample request I'm using to test:

POST http://localhost:48081/api/v1/deviceservice

{"_class":"org.edgexfoundry.domain.meta.DeviceService","adminState":"unlocked","operatingState":"disabled1","name":"ds_test1","description":"manage home variable speed motors","lastConnected":0,"lastReported":0,"labels":["hvac","variable speed motor"],"origin":1471806386919}

 

In the request body above there are two issues. First, the operatingState value is invalid. Second, the Addressable is missing. If you execute the above locally, you'll receive a 400 (Bad Request) with the error message "invalid OperatingState 'disabled1'".

If you fix the OperatingState value, then you'll still receive a 400 with a new error message "Addressable ID and Name are both blank". This is coming from the validator inside of the Addressable model within core-contracts. I noticed that we'll have the opportunity to slim our controller logic by moving the server-side validation we do today into the models. We'll also have consistent state validation (example: validation today is different depending on if you're adding a new Device Service versus updating an existing one).

 

The order of operations with regard to unmarshaling and validation looks like this

  • DeviceService
    • UnmarshalJSON
      • OperatingState (if provided in the request)
        • UnmarshalJSON
        • validate()
      • AdminState (if provided in the request)
        • UnmarshalJSON
        • validate()
    • validate()
      • OperatingState
        • validate()
      • AdminState
        • validate()

Herein is the only thing that gives me minor heartburn about the solution. The validate() method of both OperatingState and AdminState are called as part of their respective UnmarshalJSON implementations. It doesn't have to be but I thought it made sense to do so since it allows us to fail faster if we detect a problem.

 

Assuming those pass, then DeviceService calls its own validate() method which takes care of validating all child types. This is necessary in case a child type's key was omitted in the JSON. This means validation for any of the child types can occur twice -- a cheap operation to be sure, but redundant at times.

 

As an aside, you will see changes in the feature branches related to the elimination of models.Service from core-contracts. Nothing else was composed from this model other than the DeviceService and I felt that eliminating it would simplify the solution.

 

Comments are welcome. If this looks like a viable approach to folks, then there might be some bandwidth soon to undertake adding UnmarshalJSON to the models that do not have them, and implementing the necessary validate() handlers.

 

Trevor Conn
Technical Staff Engineer
Dell Technologies | IoT DellTech
Trevor_Conn@...
Round Rock, TX  USA

Re: [Edgex-tsc-core] Core Contract Model Validators

Trevor.Conn@...
 

Hi Tony – An issue with the sample code in your branch is that Addressable must make a call to a.Validate() on line 179 instead of returning nil. This is because there are other endpoints that take an Addressable by itself independent of a Device Service. So we can’t rely on Device Service to always manage the Addressable validation for us.

 

As to your question about “request” types, I created a branch earlier that contains one for adding a provision watcher. Take a look here:

https://github.com/tsconn23/go-mod-core-contracts/tree/marshal_bug/requests/provision_watcher

 

Also, a short article describing the overall concept.

https://qualitycoding.org/request-model/

 

Trevor

 

From: EdgeX-GoLang@... <EdgeX-GoLang@...> On Behalf Of espy
Sent: Sunday, April 7, 2019 5:07 PM
To: Conn, Trevor; Bhandaru, Malini (VMware); EdgeX-GoLang@...; EdgeX-TSC-Core@...
Subject: Re: [Edgex-golang] [Edgex-tsc-core] Core Contract Model Validators

 

[EXTERNAL EMAIL]

On 4/4/19 6:55 PM, Trevor.Conn@... wrote:

Thanks for providing that summary Tony.

 

I've been working on incorporating the "is validated" discussion from this morning into the feature branches below. There's a problem in that we have types that are aliased to primitives that won't support it. For example,

 

type AdminState string

type OperatingState string

 

That means if provided in the request, types like this will always be checked twice. No way around it unless we rewrite the types which could then have other side effects.

I *think* what you're saying here is that if we changed the UnmarshallJSON() functions to call <model>.Validate() in adminstate.go and operatingstate.go, then we have the same problem as with Addressable that you solve below...

As for a more complex type like Addressable, the solution is as follows. I add a non-exported member to the struct

 

type Addressable struct {
    ...

    isValidated bool
}

 

Then set the above to "true" when Validate() is called. I have extended the Validator interface to support an IsComplete() function returning a bool. The implementation looks like this.

 

func (a Addressable) IsComplete() bool {
    return a.isValidated
}

 

I've integrated this with the reflection code we went through this morning and it does work, but only for Addressable. Refactoring our request types as discussed this morning would get us out of needing to do this validation check at all, but I understand that's a lot of work.

I'm not sure what you mean by the last sentence?

Should I continue to pursue this with the understanding that we are only able to check a complex type for whether it's already been validated or not? If no one responds, I'm going to take that as a "Yes".

 

Latest changes are in this branch

https://github.com/tsconn23/go-mod-core-contracts/tree/validator_intf

 

One downside to this approach is that once the isValidated flag is set, it can become stale if the instance is modified after being unmarshaled. If you wanted for instance to re-check the instance before saving to a persistent store, it wouldn't work. Now maybe we say our models are immutable once received, but I know at least one case (Add) where a model is modified after being unmarshaled. Perhaps there are more cases of this?

 

I'm pretty sure I mentioned another way to solve this last Thu, which is to make validation the responsible of the code that triggers the original unmarshal call vs. having each model's UnmarshallJSON() method be responsible for it. This gets rid of the duplicate Validate calls and is a bit simpler than adding a isValid flag to each model.  Checkout my code (based on your branch):

 

https://github.com/tonyespy/go-mod-core-contracts/commits/val-intf-awe

 

Regards,
/tony

Trevor Conn
Senior Principal Software Engineer
Dell Technologies | IoT DellTech
Trevor_Conn@...
Round Rock, TX  USA


From: EdgeX-GoLang@... <EdgeX-GoLang@...> on behalf of espy <espy@...>
Sent: Thursday, April 4, 2019 3:27 PM
To: Bhandaru, Malini (VMware); Conn, Trevor; EdgeX-GoLang@...; edgex-tsc-core@...
Subject: Re: [Edgex-golang] [Edgex-tsc-core] Core Contract Model Validators

 

[EXTERNAL EMAIL]

On 4/4/19 3:39 PM, Malini Bhandaru wrote:

Thank you Trevor for the links! Shall check them out.

Thanks for the reply Malini!

 

Am I correct in assuming that the validator will be invoked at the recipient endpoint of a rest call, and only then?

That's what we were discussing during our call today. The original problem was that a received JSON model was missing a required field. Trevor's branches both make the call to Validate in the UnmarshallJSON methods of our models. So when any of our services handle incoming REST calls with JSON payloads, the service calls json.Unmarshall, which in turn causes the UnmarshallJSON method of the top-level model, and all of it's children to be called recursively. The problem is that since Validate is called by each model's UnmarshallJSON method, you can end up with Validate being called more than once for some of the embedded models. My suggestion to call validate *after* the json.Unmarshall call was an alternative to adding a "valid" flag to each model in order to reduce the duplicate calls.

All that said, there's nothing that would prevent our services (or unit tests) from using Validate() in other areas of the code...

Then we could as part of our microservices have a base class that does validation as a first step, kind of like form input validation at the server side.

Unfortunately Go doesn't provide traditional OO inheritance, meaning there's no concept of base classes. Your analogy is correct though, with this addition, our services would validate REST input (CBOR or JSON instead of HTML forms) before doing anything with the resulting models.

Reflection used recursively does makes perfect sense.

 

There was a comment during today’s core WG call about annotations and min required fields.

One thing that this does not address is “context” based required fields. For example a person may need to share only name and phone in “contact” but

Needs to share name, phone and address for a parcel delivery type flow.

Actually this proposal allows each model to decide what is required in order for the model to be considered valid. The original bug was that a provisionwatcher was received and it lacked an "OperatingState". With this approach, the OperatingState Validate() method would return an error as since one wasn't received in the JSON, the OperatingState isn't set to one of the two allowed values ("Enabled" or "Disabled").

What do people think about using gRPC for inter microservice communication? Not for Edinburg but going forward. gRPC has schemas, efficient.

We had a long discussion about the merits of gRPC during our original design discussions about binary data support. In the end, we decided to go with a new serialization format (CBOR), but keep REST as the default mechanism for cross-service communications.

Note - JSON also supports schemas, and we discussed the merits of using a schema validation vs. an Go interface-based approach, and it was felt the latter was a better solution.

Regards,
/tony

But Trevor may point out that schemas also constrain that we provide more as opposed to just a name.

 

Was driving, so I was unsuccessful in unmuting, but want to chime in on log levels. I do think the default should not be trace.

 

Sincerely

Malini

 

From: <EdgeX-GoLang@...> on behalf of "espy via Lists.Edgexfoundry.Org" <espy=canonical.com@...>
Reply-To: "espy@..." <espy@...>
Date: Thursday, April 4, 2019 at 12:13 PM
To: "Conn, Trevor (EMC)" <Trevor_Conn@...>, "EdgeX-GoLang@..." <EdgeX-GoLang@...>, "edgex-tsc-core@..." <edgex-tsc-core@...>
Cc: "EdgeX-GoLang@..." <EdgeX-GoLang@...>
Subject: Re: [Edgex-golang] [Edgex-tsc-core] Core Contract Model Validators

 

On 4/4/19 9:49 AM, Tony Espy wrote:

On 4/2/19 6:38 PM, Trevor.Conn@... wrote:

Another update on this –

 

This week in the #core channel on Slack Ian Johnson asked me why the approach below didn’t involve creating a Validator interface and using reflection to go through the members of a type, calling Validate() on those that implemented the interface. As those on the last Core WG call will remember, this was the original idea I proposed. So since Ian effectively +1’ed my original proposal I decided to implement it for a side by side comparison.

 

The changes only needed to be made to the core-contracts module and can be found here:

https://github.com/tsconn23/go-mod-core-contracts/tree/validator_intf

 

This is the first Go-based reflection code I’ve written, so there could be other ways to do this. However, I used the JSON Unmarshaler implementation from the standard library as an example.

I did a bit of research on this and there's been at least one formal golang proposal to enhance the JSON annotation language to cover missing fields:

https://github.com/golang/go/issues/16426

That said, although the proposal was shot down, the use of a Validator interface was discussed as an alternative.

I'm +1 on the overall approach (including use of reflection).

 

With the current structure of our requests, it’s hard to know where to invoke the validation from. In this example, the DeviceService is the top-level struct so I start the process here. However one could argue that an Addressable should be able to validate independently since we can add/update an Addressable on its own. That would mean the Addressable would end up being validated twice in this case.

I think always validating when unmarshalling an external request makes sense, and one could argue that validating if/when we construct a new instance of a model makes sense too. I'm not sure I follow the exact flow where we'd validate twice?

An alternate/simpler approach to that solves the duplicate validation calls problem is just to make validation a responsibility of the original caller (ie. the function/method that makes the first json.Unmarshall()) instead of always calling Validate() at the end of a model's UnmarshallJSON().

Regards,
/tony

Since we pass a deeply nested multii-purpose model around instead of Request types specific to the API endpoint that could be used for this kind of control, it probably is what it is unless we refactor.

 

Also, w/r/t recursion, I’d look to accomplish that through the Validate() functions at each node in the graph. So DeviceService calls Validate() on all its sub-types where applicable, then those sub-types would do the same followed by their sub-types and so on.

Seems reasonable

Regards,
/tony

 

Let me know your thoughts. We’ll probably be talking about this at this week’s meeting.

 

Trevor

 

 

From: Conn, Trevor
Sent: Saturday, March 30, 2019 4:27 PM
To: EdgeX-GoLang@...; edgex-tsc-core@...
Subject: Core Contract Model Validators

 

Hi all -- I wanted to follow up on this discussion we've been having in the Core Working Group about validating our contract models and where that responsibility lies. This past week we talked about two options.

 

1.) Using a JSON Schema-based validation

2.) Creating a validator implementation that can be called as part of the UnmarshalJSON operation found in some of our models.

 

I explored the JSON Schema solution a bit on Thursday afternoon with help from Intel, but I did not find it palatable. The schema has to be created and maintained then loaded at runtime from either URL, file or inline string. Child types have their schema, but that schema must be either be repeated in the schema of the parent type or referenced via a pointer. This all seemed like too much work so before getting seriously into it, I decided to work on the validator idea.

 

I created a couple feature branches which you can review/pull at the links below. This involves changes to both core-contracts and edgex-go.

 

https://github.com/tsconn23/go-mod-core-contracts/tree/validators

https://github.com/tsconn23/edgex-go/tree/validators

 

Rather than ProvisionWatcher I have used DeviceService in my example above because it already had an UnmarshalJSON() method in place, and also with the ProvisionWatcher I'd have to account for DeviceProfile as well. I just wanted to get something up and going to try and prove the concept, so the scope was more reduced by using DeviceService.

 

Here's a sample request I'm using to test:

POST http://localhost:48081/api/v1/deviceservice

{"_class":"org.edgexfoundry.domain.meta.DeviceService","adminState":"unlocked","operatingState":"disabled1","name":"ds_test1","description":"manage home variable speed motors","lastConnected":0,"lastReported":0,"labels":["hvac","variable speed motor"],"origin":1471806386919}

 

In the request body above there are two issues. First, the operatingState value is invalid. Second, the Addressable is missing. If you execute the above locally, you'll receive a 400 (Bad Request) with the error message "invalid OperatingState 'disabled1'".

If you fix the OperatingState value, then you'll still receive a 400 with a new error message "Addressable ID and Name are both blank". This is coming from the validator inside of the Addressable model within core-contracts. I noticed that we'll have the opportunity to slim our controller logic by moving the server-side validation we do today into the models. We'll also have consistent state validation (example: validation today is different depending on if you're adding a new Device Service versus updating an existing one).

 

The order of operations with regard to unmarshaling and validation looks like this

  • DeviceService
    • UnmarshalJSON
      • OperatingState (if provided in the request)
        • UnmarshalJSON
        • validate()
      • AdminState (if provided in the request)
        • UnmarshalJSON
        • validate()
    • validate()
      • OperatingState
        • validate()
      • AdminState
        • validate()

Herein is the only thing that gives me minor heartburn about the solution. The validate() method of both OperatingState and AdminState are called as part of their respective UnmarshalJSON implementations. It doesn't have to be but I thought it made sense to do so since it allows us to fail faster if we detect a problem.

 

Assuming those pass, then DeviceService calls its own validate() method which takes care of validating all child types. This is necessary in case a child type's key was omitted in the JSON. This means validation for any of the child types can occur twice -- a cheap operation to be sure, but redundant at times.

 

As an aside, you will see changes in the feature branches related to the elimination of models.Service from core-contracts. Nothing else was composed from this model other than the DeviceService and I felt that eliminating it would simplify the solution.

 

Comments are welcome. If this looks like a viable approach to folks, then there might be some bandwidth soon to undertake adding UnmarshalJSON to the models that do not have them, and implementing the necessary validate() handlers.

 

Trevor Conn
Technical Staff Engineer
Dell Technologies | IoT DellTech
Trevor_Conn@...
Round Rock, TX  USA

Re: [Edgex-tsc-core] Core Contract Model Validators

espy
 

On 4/8/19 10:34 AM, Trevor.Conn@... wrote:

Hi Tony – An issue with the sample code in your branch is that Addressable must make a call to a.Validate() on line 179 instead of returning nil. This is because there are other endpoints that take an Addressable by itself independent of a Device Service. So we can’t rely on Device Service to always manage the Addressable validation for us.

Then the other endpoints just need to call Validate().

 

As to your question about “request” types, I created a branch earlier that contains one for adding a provision watcher. Take a look here:

https://github.com/tsconn23/go-mod-core-contracts/tree/marshal_bug/requests/provision_watcher

So this branch is yet another approach to the same problem? Sorry if I'm being dense here? This looks like you've added validation logic to MarshalJSON, doesn't this duplicate the same logic we have in the Validate approach?

/t

 

Also, a short article describing the overall concept.

https://qualitycoding.org/request-model/

 

Trevor

 

From: EdgeX-GoLang@... <EdgeX-GoLang@...> On Behalf Of espy
Sent: Sunday, April 7, 2019 5:07 PM
To: Conn, Trevor; Bhandaru, Malini (VMware); EdgeX-GoLang@...; EdgeX-TSC-Core@...
Subject: Re: [Edgex-golang] [Edgex-tsc-core] Core Contract Model Validators

 

[EXTERNAL EMAIL]

On 4/4/19 6:55 PM, Trevor.Conn@... wrote:

Thanks for providing that summary Tony.

 

I've been working on incorporating the "is validated" discussion from this morning into the feature branches below. There's a problem in that we have types that are aliased to primitives that won't support it. For example,

 

type AdminState string

type OperatingState string

 

That means if provided in the request, types like this will always be checked twice. No way around it unless we rewrite the types which could then have other side effects.

I *think* what you're saying here is that if we changed the UnmarshallJSON() functions to call <model>.Validate() in adminstate.go and operatingstate.go, then we have the same problem as with Addressable that you solve below...

As for a more complex type like Addressable, the solution is as follows. I add a non-exported member to the struct

 

type Addressable struct {
    ...

    isValidated bool
}

 

Then set the above to "true" when Validate() is called. I have extended the Validator interface to support an IsComplete() function returning a bool. The implementation looks like this.

 

func (a Addressable) IsComplete() bool {
    return a.isValidated
}

 

I've integrated this with the reflection code we went through this morning and it does work, but only for Addressable. Refactoring our request types as discussed this morning would get us out of needing to do this validation check at all, but I understand that's a lot of work.

I'm not sure what you mean by the last sentence?

Should I continue to pursue this with the understanding that we are only able to check a complex type for whether it's already been validated or not? If no one responds, I'm going to take that as a "Yes".

 

Latest changes are in this branch

https://github.com/tsconn23/go-mod-core-contracts/tree/validator_intf

 

One downside to this approach is that once the isValidated flag is set, it can become stale if the instance is modified after being unmarshaled. If you wanted for instance to re-check the instance before saving to a persistent store, it wouldn't work. Now maybe we say our models are immutable once received, but I know at least one case (Add) where a model is modified after being unmarshaled. Perhaps there are more cases of this?

 

I'm pretty sure I mentioned another way to solve this last Thu, which is to make validation the responsible of the code that triggers the original unmarshal call vs. having each model's UnmarshallJSON() method be responsible for it. This gets rid of the duplicate Validate calls and is a bit simpler than adding a isValid flag to each model.  Checkout my code (based on your branch):

 

https://github.com/tonyespy/go-mod-core-contracts/commits/val-intf-awe

 

Regards,
/tony

Trevor Conn
Senior Principal Software Engineer
Dell Technologies | IoT DellTech
Trevor_Conn@...
Round Rock, TX  USA


From: EdgeX-GoLang@... <EdgeX-GoLang@...> on behalf of espy <espy@...>
Sent: Thursday, April 4, 2019 3:27 PM
To: Bhandaru, Malini (VMware); Conn, Trevor; EdgeX-GoLang@...; edgex-tsc-core@...
Subject: Re: [Edgex-golang] [Edgex-tsc-core] Core Contract Model Validators

 

[EXTERNAL EMAIL]

On 4/4/19 3:39 PM, Malini Bhandaru wrote:

Thank you Trevor for the links! Shall check them out.

Thanks for the reply Malini!

 

Am I correct in assuming that the validator will be invoked at the recipient endpoint of a rest call, and only then?

That's what we were discussing during our call today. The original problem was that a received JSON model was missing a required field. Trevor's branches both make the call to Validate in the UnmarshallJSON methods of our models. So when any of our services handle incoming REST calls with JSON payloads, the service calls json.Unmarshall, which in turn causes the UnmarshallJSON method of the top-level model, and all of it's children to be called recursively. The problem is that since Validate is called by each model's UnmarshallJSON method, you can end up with Validate being called more than once for some of the embedded models. My suggestion to call validate *after* the json.Unmarshall call was an alternative to adding a "valid" flag to each model in order to reduce the duplicate calls.

All that said, there's nothing that would prevent our services (or unit tests) from using Validate() in other areas of the code...

Then we could as part of our microservices have a base class that does validation as a first step, kind of like form input validation at the server side.

Unfortunately Go doesn't provide traditional OO inheritance, meaning there's no concept of base classes. Your analogy is correct though, with this addition, our services would validate REST input (CBOR or JSON instead of HTML forms) before doing anything with the resulting models.

Reflection used recursively does makes perfect sense.

 

There was a comment during today’s core WG call about annotations and min required fields.

One thing that this does not address is “context” based required fields. For example a person may need to share only name and phone in “contact” but

Needs to share name, phone and address for a parcel delivery type flow.

Actually this proposal allows each model to decide what is required in order for the model to be considered valid. The original bug was that a provisionwatcher was received and it lacked an "OperatingState". With this approach, the OperatingState Validate() method would return an error as since one wasn't received in the JSON, the OperatingState isn't set to one of the two allowed values ("Enabled" or "Disabled").

What do people think about using gRPC for inter microservice communication? Not for Edinburg but going forward. gRPC has schemas, efficient.

We had a long discussion about the merits of gRPC during our original design discussions about binary data support. In the end, we decided to go with a new serialization format (CBOR), but keep REST as the default mechanism for cross-service communications.

Note - JSON also supports schemas, and we discussed the merits of using a schema validation vs. an Go interface-based approach, and it was felt the latter was a better solution.

Regards,
/tony

But Trevor may point out that schemas also constrain that we provide more as opposed to just a name.

 

Was driving, so I was unsuccessful in unmuting, but want to chime in on log levels. I do think the default should not be trace.

 

Sincerely

Malini

 

From: <EdgeX-GoLang@...> on behalf of "espy via Lists.Edgexfoundry.Org" <espy=canonical.com@...>
Reply-To: "espy@..." <espy@...>
Date: Thursday, April 4, 2019 at 12:13 PM
To: "Conn, Trevor (EMC)" <Trevor_Conn@...>, "EdgeX-GoLang@..." <EdgeX-GoLang@...>, "edgex-tsc-core@..." <edgex-tsc-core@...>
Cc: "EdgeX-GoLang@..." <EdgeX-GoLang@...>
Subject: Re: [Edgex-golang] [Edgex-tsc-core] Core Contract Model Validators

 

On 4/4/19 9:49 AM, Tony Espy wrote:

On 4/2/19 6:38 PM, Trevor.Conn@... wrote:

Another update on this –

 

This week in the #core channel on Slack Ian Johnson asked me why the approach below didn’t involve creating a Validator interface and using reflection to go through the members of a type, calling Validate() on those that implemented the interface. As those on the last Core WG call will remember, this was the original idea I proposed. So since Ian effectively +1’ed my original proposal I decided to implement it for a side by side comparison.

 

The changes only needed to be made to the core-contracts module and can be found here:

https://github.com/tsconn23/go-mod-core-contracts/tree/validator_intf

 

This is the first Go-based reflection code I’ve written, so there could be other ways to do this. However, I used the JSON Unmarshaler implementation from the standard library as an example.

I did a bit of research on this and there's been at least one formal golang proposal to enhance the JSON annotation language to cover missing fields:

https://github.com/golang/go/issues/16426

That said, although the proposal was shot down, the use of a Validator interface was discussed as an alternative.

I'm +1 on the overall approach (including use of reflection).

 

With the current structure of our requests, it’s hard to know where to invoke the validation from. In this example, the DeviceService is the top-level struct so I start the process here. However one could argue that an Addressable should be able to validate independently since we can add/update an Addressable on its own. That would mean the Addressable would end up being validated twice in this case.

I think always validating when unmarshalling an external request makes sense, and one could argue that validating if/when we construct a new instance of a model makes sense too. I'm not sure I follow the exact flow where we'd validate twice?

An alternate/simpler approach to that solves the duplicate validation calls problem is just to make validation a responsibility of the original caller (ie. the function/method that makes the first json.Unmarshall()) instead of always calling Validate() at the end of a model's UnmarshallJSON().

Regards,
/tony

Since we pass a deeply nested multii-purpose model around instead of Request types specific to the API endpoint that could be used for this kind of control, it probably is what it is unless we refactor.

 

Also, w/r/t recursion, I’d look to accomplish that through the Validate() functions at each node in the graph. So DeviceService calls Validate() on all its sub-types where applicable, then those sub-types would do the same followed by their sub-types and so on.

Seems reasonable

Regards,
/tony

 

Let me know your thoughts. We’ll probably be talking about this at this week’s meeting.

 

Trevor

 

 

From: Conn, Trevor
Sent: Saturday, March 30, 2019 4:27 PM
To: EdgeX-GoLang@...; edgex-tsc-core@...
Subject: Core Contract Model Validators

 

Hi all -- I wanted to follow up on this discussion we've been having in the Core Working Group about validating our contract models and where that responsibility lies. This past week we talked about two options.

 

1.) Using a JSON Schema-based validation

2.) Creating a validator implementation that can be called as part of the UnmarshalJSON operation found in some of our models.

 

I explored the JSON Schema solution a bit on Thursday afternoon with help from Intel, but I did not find it palatable. The schema has to be created and maintained then loaded at runtime from either URL, file or inline string. Child types have their schema, but that schema must be either be repeated in the schema of the parent type or referenced via a pointer. This all seemed like too much work so before getting seriously into it, I decided to work on the validator idea.

 

I created a couple feature branches which you can review/pull at the links below. This involves changes to both core-contracts and edgex-go.

 

https://github.com/tsconn23/go-mod-core-contracts/tree/validators

https://github.com/tsconn23/edgex-go/tree/validators

 

Rather than ProvisionWatcher I have used DeviceService in my example above because it already had an UnmarshalJSON() method in place, and also with the ProvisionWatcher I'd have to account for DeviceProfile as well. I just wanted to get something up and going to try and prove the concept, so the scope was more reduced by using DeviceService.

 

Here's a sample request I'm using to test:

POST http://localhost:48081/api/v1/deviceservice

{"_class":"org.edgexfoundry.domain.meta.DeviceService","adminState":"unlocked","operatingState":"disabled1","name":"ds_test1","description":"manage home variable speed motors","lastConnected":0,"lastReported":0,"labels":["hvac","variable speed motor"],"origin":1471806386919}

 

In the request body above there are two issues. First, the operatingState value is invalid. Second, the Addressable is missing. If you execute the above locally, you'll receive a 400 (Bad Request) with the error message "invalid OperatingState 'disabled1'".

If you fix the OperatingState value, then you'll still receive a 400 with a new error message "Addressable ID and Name are both blank". This is coming from the validator inside of the Addressable model within core-contracts. I noticed that we'll have the opportunity to slim our controller logic by moving the server-side validation we do today into the models. We'll also have consistent state validation (example: validation today is different depending on if you're adding a new Device Service versus updating an existing one).

 

The order of operations with regard to unmarshaling and validation looks like this

  • DeviceService
    • UnmarshalJSON
      • OperatingState (if provided in the request)
        • UnmarshalJSON
        • validate()
      • AdminState (if provided in the request)
        • UnmarshalJSON
        • validate()
    • validate()
      • OperatingState
        • validate()
      • AdminState
        • validate()

Herein is the only thing that gives me minor heartburn about the solution. The validate() method of both OperatingState and AdminState are called as part of their respective UnmarshalJSON implementations. It doesn't have to be but I thought it made sense to do so since it allows us to fail faster if we detect a problem.

 

Assuming those pass, then DeviceService calls its own validate() method which takes care of validating all child types. This is necessary in case a child type's key was omitted in the JSON. This means validation for any of the child types can occur twice -- a cheap operation to be sure, but redundant at times.

 

As an aside, you will see changes in the feature branches related to the elimination of models.Service from core-contracts. Nothing else was composed from this model other than the DeviceService and I felt that eliminating it would simplify the solution.

 

Comments are welcome. If this looks like a viable approach to folks, then there might be some bandwidth soon to undertake adding UnmarshalJSON to the models that do not have them, and implementing the necessary validate() handlers.

 

Trevor Conn
Technical Staff Engineer
Dell Technologies | IoT DellTech
Trevor_Conn@...
Round Rock, TX  USA

Re: [Edgex-tsc-core] Core Contract Model Validators

Trevor.Conn@...
 

We’re talking in circles now. If you want to have a call, let me know but I have to start working on this.

 

Trevor

 

From: Tony Espy <espy@...>
Sent: Monday, April 8, 2019 11:16 AM
To: Conn, Trevor; Bhandaru, Malini (VMware); EdgeX-GoLang@...; EdgeX-TSC-Core@...
Subject: Re: [Edgex-golang] [Edgex-tsc-core] Core Contract Model Validators

 

[EXTERNAL EMAIL]

On 4/8/19 10:34 AM, Trevor.Conn@... wrote:

Hi Tony – An issue with the sample code in your branch is that Addressable must make a call to a.Validate() on line 179 instead of returning nil. This is because there are other endpoints that take an Addressable by itself independent of a Device Service. So we can’t rely on Device Service to always manage the Addressable validation for us.

Then the other endpoints just need to call Validate().

 

As to your question about “request” types, I created a branch earlier that contains one for adding a provision watcher. Take a look here:

https://github.com/tsconn23/go-mod-core-contracts/tree/marshal_bug/requests/provision_watcher

So this branch is yet another approach to the same problem? Sorry if I'm being dense here? This looks like you've added validation logic to MarshalJSON, doesn't this duplicate the same logic we have in the Validate approach?

/t

 

Also, a short article describing the overall concept.

https://qualitycoding.org/request-model/

 

Trevor

 

From: EdgeX-GoLang@... <EdgeX-GoLang@...> On Behalf Of espy
Sent: Sunday, April 7, 2019 5:07 PM
To: Conn, Trevor; Bhandaru, Malini (VMware); EdgeX-GoLang@...; EdgeX-TSC-Core@...
Subject: Re: [Edgex-golang] [Edgex-tsc-core] Core Contract Model Validators

 

[EXTERNAL EMAIL]

On 4/4/19 6:55 PM, Trevor.Conn@... wrote:

Thanks for providing that summary Tony.

 

I've been working on incorporating the "is validated" discussion from this morning into the feature branches below. There's a problem in that we have types that are aliased to primitives that won't support it. For example,

 

type AdminState string

type OperatingState string

 

That means if provided in the request, types like this will always be checked twice. No way around it unless we rewrite the types which could then have other side effects.

I *think* what you're saying here is that if we changed the UnmarshallJSON() functions to call <model>.Validate() in adminstate.go and operatingstate.go, then we have the same problem as with Addressable that you solve below...

As for a more complex type like Addressable, the solution is as follows. I add a non-exported member to the struct

 

type Addressable struct {
    ...

    isValidated bool
}

 

Then set the above to "true" when Validate() is called. I have extended the Validator interface to support an IsComplete() function returning a bool. The implementation looks like this.

 

func (a Addressable) IsComplete() bool {
    return a.isValidated
}

 

I've integrated this with the reflection code we went through this morning and it does work, but only for Addressable. Refactoring our request types as discussed this morning would get us out of needing to do this validation check at all, but I understand that's a lot of work.

I'm not sure what you mean by the last sentence?


Should I continue to pursue this with the understanding that we are only able to check a complex type for whether it's already been validated or not? If no one responds, I'm going to take that as a "Yes".

 

Latest changes are in this branch

https://github.com/tsconn23/go-mod-core-contracts/tree/validator_intf

 

One downside to this approach is that once the isValidated flag is set, it can become stale if the instance is modified after being unmarshaled. If you wanted for instance to re-check the instance before saving to a persistent store, it wouldn't work. Now maybe we say our models are immutable once received, but I know at least one case (Add) where a model is modified after being unmarshaled. Perhaps there are more cases of this?

 

I'm pretty sure I mentioned another way to solve this last Thu, which is to make validation the responsible of the code that triggers the original unmarshal call vs. having each model's UnmarshallJSON() method be responsible for it. This gets rid of the duplicate Validate calls and is a bit simpler than adding a isValid flag to each model.  Checkout my code (based on your branch):

 

https://github.com/tonyespy/go-mod-core-contracts/commits/val-intf-awe

 

Regards,
/tony

Trevor Conn
Senior Principal Software Engineer
Dell Technologies | IoT DellTech
Trevor_Conn@...
Round Rock, TX  USA


From: EdgeX-GoLang@... <EdgeX-GoLang@...> on behalf of espy <espy@...>
Sent: Thursday, April 4, 2019 3:27 PM
To: Bhandaru, Malini (VMware); Conn, Trevor; EdgeX-GoLang@...; edgex-tsc-core@...
Subject: Re: [Edgex-golang] [Edgex-tsc-core] Core Contract Model Validators

 

[EXTERNAL EMAIL]

On 4/4/19 3:39 PM, Malini Bhandaru wrote:

Thank you Trevor for the links! Shall check them out.

Thanks for the reply Malini!


 

Am I correct in assuming that the validator will be invoked at the recipient endpoint of a rest call, and only then?

That's what we were discussing during our call today. The original problem was that a received JSON model was missing a required field. Trevor's branches both make the call to Validate in the UnmarshallJSON methods of our models. So when any of our services handle incoming REST calls with JSON payloads, the service calls json.Unmarshall, which in turn causes the UnmarshallJSON method of the top-level model, and all of it's children to be called recursively. The problem is that since Validate is called by each model's UnmarshallJSON method, you can end up with Validate being called more than once for some of the embedded models. My suggestion to call validate *after* the json.Unmarshall call was an alternative to adding a "valid" flag to each model in order to reduce the duplicate calls.

All that said, there's nothing that would prevent our services (or unit tests) from using Validate() in other areas of the code...

Then we could as part of our microservices have a base class that does validation as a first step, kind of like form input validation at the server side.

Unfortunately Go doesn't provide traditional OO inheritance, meaning there's no concept of base classes. Your analogy is correct though, with this addition, our services would validate REST input (CBOR or JSON instead of HTML forms) before doing anything with the resulting models.


Reflection used recursively does makes perfect sense.

 

There was a comment during today’s core WG call about annotations and min required fields.

One thing that this does not address is “context” based required fields. For example a person may need to share only name and phone in “contact” but

Needs to share name, phone and address for a parcel delivery type flow.

Actually this proposal allows each model to decide what is required in order for the model to be considered valid. The original bug was that a provisionwatcher was received and it lacked an "OperatingState". With this approach, the OperatingState Validate() method would return an error as since one wasn't received in the JSON, the OperatingState isn't set to one of the two allowed values ("Enabled" or "Disabled").


What do people think about using gRPC for inter microservice communication? Not for Edinburg but going forward. gRPC has schemas, efficient.

We had a long discussion about the merits of gRPC during our original design discussions about binary data support. In the end, we decided to go with a new serialization format (CBOR), but keep REST as the default mechanism for cross-service communications.

Note - JSON also supports schemas, and we discussed the merits of using a schema validation vs. an Go interface-based approach, and it was felt the latter was a better solution.

Regards,
/tony

But Trevor may point out that schemas also constrain that we provide more as opposed to just a name.

 

Was driving, so I was unsuccessful in unmuting, but want to chime in on log levels. I do think the default should not be trace.

 

Sincerely

Malini

 

From: <EdgeX-GoLang@...> on behalf of "espy via Lists.Edgexfoundry.Org" <espy=canonical.com@...>
Reply-To: "espy@..." <espy@...>
Date: Thursday, April 4, 2019 at 12:13 PM
To: "Conn, Trevor (EMC)" <Trevor_Conn@...>, "EdgeX-GoLang@..." <EdgeX-GoLang@...>, "edgex-tsc-core@..." <edgex-tsc-core@...>
Cc: "EdgeX-GoLang@..." <EdgeX-GoLang@...>
Subject: Re: [Edgex-golang] [Edgex-tsc-core] Core Contract Model Validators

 

On 4/4/19 9:49 AM, Tony Espy wrote:

On 4/2/19 6:38 PM, Trevor.Conn@... wrote:

Another update on this –

 

This week in the #core channel on Slack Ian Johnson asked me why the approach below didn’t involve creating a Validator interface and using reflection to go through the members of a type, calling Validate() on those that implemented the interface. As those on the last Core WG call will remember, this was the original idea I proposed. So since Ian effectively +1’ed my original proposal I decided to implement it for a side by side comparison.

 

The changes only needed to be made to the core-contracts module and can be found here:

https://github.com/tsconn23/go-mod-core-contracts/tree/validator_intf

 

This is the first Go-based reflection code I’ve written, so there could be other ways to do this. However, I used the JSON Unmarshaler implementation from the standard library as an example.

I did a bit of research on this and there's been at least one formal golang proposal to enhance the JSON annotation language to cover missing fields:

https://github.com/golang/go/issues/16426

That said, although the proposal was shot down, the use of a Validator interface was discussed as an alternative.

I'm +1 on the overall approach (including use of reflection).

 

With the current structure of our requests, it’s hard to know where to invoke the validation from. In this example, the DeviceService is the top-level struct so I start the process here. However one could argue that an Addressable should be able to validate independently since we can add/update an Addressable on its own. That would mean the Addressable would end up being validated twice in this case.

I think always validating when unmarshalling an external request makes sense, and one could argue that validating if/when we construct a new instance of a model makes sense too. I'm not sure I follow the exact flow where we'd validate twice?

An alternate/simpler approach to that solves the duplicate validation calls problem is just to make validation a responsibility of the original caller (ie. the function/method that makes the first json.Unmarshall()) instead of always calling Validate() at the end of a model's UnmarshallJSON().

Regards,
/tony

Since we pass a deeply nested multii-purpose model around instead of Request types specific to the API endpoint that could be used for this kind of control, it probably is what it is unless we refactor.

 

Also, w/r/t recursion, I’d look to accomplish that through the Validate() functions at each node in the graph. So DeviceService calls Validate() on all its sub-types where applicable, then those sub-types would do the same followed by their sub-types and so on.

Seems reasonable

Regards,
/tony

 

Let me know your thoughts. We’ll probably be talking about this at this week’s meeting.

 

Trevor

 

 

From: Conn, Trevor
Sent: Saturday, March 30, 2019 4:27 PM
To: EdgeX-GoLang@...; edgex-tsc-core@...
Subject: Core Contract Model Validators

 

Hi all -- I wanted to follow up on this discussion we've been having in the Core Working Group about validating our contract models and where that responsibility lies. This past week we talked about two options.

 

1.) Using a JSON Schema-based validation

2.) Creating a validator implementation that can be called as part of the UnmarshalJSON operation found in some of our models.

 

I explored the JSON Schema solution a bit on Thursday afternoon with help from Intel, but I did not find it palatable. The schema has to be created and maintained then loaded at runtime from either URL, file or inline string. Child types have their schema, but that schema must be either be repeated in the schema of the parent type or referenced via a pointer. This all seemed like too much work so before getting seriously into it, I decided to work on the validator idea.

 

I created a couple feature branches which you can review/pull at the links below. This involves changes to both core-contracts and edgex-go.

 

https://github.com/tsconn23/go-mod-core-contracts/tree/validators

https://github.com/tsconn23/edgex-go/tree/validators

 

Rather than ProvisionWatcher I have used DeviceService in my example above because it already had an UnmarshalJSON() method in place, and also with the ProvisionWatcher I'd have to account for DeviceProfile as well. I just wanted to get something up and going to try and prove the concept, so the scope was more reduced by using DeviceService.

 

Here's a sample request I'm using to test:

POST http://localhost:48081/api/v1/deviceservice

{"_class":"org.edgexfoundry.domain.meta.DeviceService","adminState":"unlocked","operatingState":"disabled1","name":"ds_test1","description":"manage home variable speed motors","lastConnected":0,"lastReported":0,"labels":["hvac","variable speed motor"],"origin":1471806386919}

 

In the request body above there are two issues. First, the operatingState value is invalid. Second, the Addressable is missing. If you execute the above locally, you'll receive a 400 (Bad Request) with the error message "invalid OperatingState 'disabled1'".

If you fix the OperatingState value, then you'll still receive a 400 with a new error message "Addressable ID and Name are both blank". This is coming from the validator inside of the Addressable model within core-contracts. I noticed that we'll have the opportunity to slim our controller logic by moving the server-side validation we do today into the models. We'll also have consistent state validation (example: validation today is different depending on if you're adding a new Device Service versus updating an existing one).

 

The order of operations with regard to unmarshaling and validation looks like this

  • DeviceService
    • UnmarshalJSON
      • OperatingState (if provided in the request)
        • UnmarshalJSON
        • validate()
      • AdminState (if provided in the request)
        • UnmarshalJSON
        • validate()
    • validate()
      • OperatingState
        • validate()
      • AdminState
        • validate()

Herein is the only thing that gives me minor heartburn about the solution. The validate() method of both OperatingState and AdminState are called as part of their respective UnmarshalJSON implementations. It doesn't have to be but I thought it made sense to do so since it allows us to fail faster if we detect a problem.

 

Assuming those pass, then DeviceService calls its own validate() method which takes care of validating all child types. This is necessary in case a child type's key was omitted in the JSON. This means validation for any of the child types can occur twice -- a cheap operation to be sure, but redundant at times.

 

As an aside, you will see changes in the feature branches related to the elimination of models.Service from core-contracts. Nothing else was composed from this model other than the DeviceService and I felt that eliminating it would simplify the solution.

 

Comments are welcome. If this looks like a viable approach to folks, then there might be some bandwidth soon to undertake adding UnmarshalJSON to the models that do not have them, and implementing the necessary validate() handlers.

 

Trevor Conn
Technical Staff Engineer
Dell Technologies | IoT DellTech
Trevor_Conn@...
Round Rock, TX  USA

Re: [Edgex-tsc-core] Core Contract Model Validators

Trevor.Conn@...
 

Just to update everyone on the thread, I just talked with Tony and we’re on the same page. We will be taking the reflection-based approach with the IsValidated() check for complex types.

 

Trevor

 

From: Conn, Trevor
Sent: Monday, April 8, 2019 11:23 AM
To: 'Tony Espy'; Bhandaru, Malini (VMware); EdgeX-GoLang@...; EdgeX-TSC-Core@...
Subject: RE: [Edgex-golang] [Edgex-tsc-core] Core Contract Model Validators

 

We’re talking in circles now. If you want to have a call, let me know but I have to start working on this.

 

Trevor

 

From: Tony Espy <espy@...>
Sent: Monday, April 8, 2019 11:16 AM
To: Conn, Trevor; Bhandaru, Malini (VMware); EdgeX-GoLang@...; EdgeX-TSC-Core@...
Subject: Re: [Edgex-golang] [Edgex-tsc-core] Core Contract Model Validators

 

[EXTERNAL EMAIL]

On 4/8/19 10:34 AM, Trevor.Conn@... wrote:

Hi Tony – An issue with the sample code in your branch is that Addressable must make a call to a.Validate() on line 179 instead of returning nil. This is because there are other endpoints that take an Addressable by itself independent of a Device Service. So we can’t rely on Device Service to always manage the Addressable validation for us.

Then the other endpoints just need to call Validate().

 

As to your question about “request” types, I created a branch earlier that contains one for adding a provision watcher. Take a look here:

https://github.com/tsconn23/go-mod-core-contracts/tree/marshal_bug/requests/provision_watcher

So this branch is yet another approach to the same problem? Sorry if I'm being dense here? This looks like you've added validation logic to MarshalJSON, doesn't this duplicate the same logic we have in the Validate approach?

/t

 

Also, a short article describing the overall concept.

https://qualitycoding.org/request-model/

 

Trevor

 

From: EdgeX-GoLang@... <EdgeX-GoLang@...> On Behalf Of espy
Sent: Sunday, April 7, 2019 5:07 PM
To: Conn, Trevor; Bhandaru, Malini (VMware); EdgeX-GoLang@...; EdgeX-TSC-Core@...
Subject: Re: [Edgex-golang] [Edgex-tsc-core] Core Contract Model Validators

 

[EXTERNAL EMAIL]

On 4/4/19 6:55 PM, Trevor.Conn@... wrote:

Thanks for providing that summary Tony.

 

I've been working on incorporating the "is validated" discussion from this morning into the feature branches below. There's a problem in that we have types that are aliased to primitives that won't support it. For example,

 

type AdminState string

type OperatingState string

 

That means if provided in the request, types like this will always be checked twice. No way around it unless we rewrite the types which could then have other side effects.

I *think* what you're saying here is that if we changed the UnmarshallJSON() functions to call <model>.Validate() in adminstate.go and operatingstate.go, then we have the same problem as with Addressable that you solve below...

As for a more complex type like Addressable, the solution is as follows. I add a non-exported member to the struct

 

type Addressable struct {
    ...

    isValidated bool
}

 

Then set the above to "true" when Validate() is called. I have extended the Validator interface to support an IsComplete() function returning a bool. The implementation looks like this.

 

func (a Addressable) IsComplete() bool {
    return a.isValidated
}

 

I've integrated this with the reflection code we went through this morning and it does work, but only for Addressable. Refactoring our request types as discussed this morning would get us out of needing to do this validation check at all, but I understand that's a lot of work.

I'm not sure what you mean by the last sentence?

Should I continue to pursue this with the understanding that we are only able to check a complex type for whether it's already been validated or not? If no one responds, I'm going to take that as a "Yes".

 

Latest changes are in this branch

https://github.com/tsconn23/go-mod-core-contracts/tree/validator_intf

 

One downside to this approach is that once the isValidated flag is set, it can become stale if the instance is modified after being unmarshaled. If you wanted for instance to re-check the instance before saving to a persistent store, it wouldn't work. Now maybe we say our models are immutable once received, but I know at least one case (Add) where a model is modified after being unmarshaled. Perhaps there are more cases of this?

 

I'm pretty sure I mentioned another way to solve this last Thu, which is to make validation the responsible of the code that triggers the original unmarshal call vs. having each model's UnmarshallJSON() method be responsible for it. This gets rid of the duplicate Validate calls and is a bit simpler than adding a isValid flag to each model.  Checkout my code (based on your branch):

 

https://github.com/tonyespy/go-mod-core-contracts/commits/val-intf-awe

 

Regards,
/tony

Trevor Conn
Senior Principal Software Engineer
Dell Technologies | IoT DellTech
Trevor_Conn@...
Round Rock, TX  USA


From: EdgeX-GoLang@... <EdgeX-GoLang@...> on behalf of espy <espy@...>
Sent: Thursday, April 4, 2019 3:27 PM
To: Bhandaru, Malini (VMware); Conn, Trevor; EdgeX-GoLang@...; edgex-tsc-core@...
Subject: Re: [Edgex-golang] [Edgex-tsc-core] Core Contract Model Validators

 

[EXTERNAL EMAIL]

On 4/4/19 3:39 PM, Malini Bhandaru wrote:

Thank you Trevor for the links! Shall check them out.

Thanks for the reply Malini!

 

Am I correct in assuming that the validator will be invoked at the recipient endpoint of a rest call, and only then?

That's what we were discussing during our call today. The original problem was that a received JSON model was missing a required field. Trevor's branches both make the call to Validate in the UnmarshallJSON methods of our models. So when any of our services handle incoming REST calls with JSON payloads, the service calls json.Unmarshall, which in turn causes the UnmarshallJSON method of the top-level model, and all of it's children to be called recursively. The problem is that since Validate is called by each model's UnmarshallJSON method, you can end up with Validate being called more than once for some of the embedded models. My suggestion to call validate *after* the json.Unmarshall call was an alternative to adding a "valid" flag to each model in order to reduce the duplicate calls.

All that said, there's nothing that would prevent our services (or unit tests) from using Validate() in other areas of the code...

Then we could as part of our microservices have a base class that does validation as a first step, kind of like form input validation at the server side.

Unfortunately Go doesn't provide traditional OO inheritance, meaning there's no concept of base classes. Your analogy is correct though, with this addition, our services would validate REST input (CBOR or JSON instead of HTML forms) before doing anything with the resulting models.

Reflection used recursively does makes perfect sense.

 

There was a comment during today’s core WG call about annotations and min required fields.

One thing that this does not address is “context” based required fields. For example a person may need to share only name and phone in “contact” but

Needs to share name, phone and address for a parcel delivery type flow.

Actually this proposal allows each model to decide what is required in order for the model to be considered valid. The original bug was that a provisionwatcher was received and it lacked an "OperatingState". With this approach, the OperatingState Validate() method would return an error as since one wasn't received in the JSON, the OperatingState isn't set to one of the two allowed values ("Enabled" or "Disabled").

What do people think about using gRPC for inter microservice communication? Not for Edinburg but going forward. gRPC has schemas, efficient.

We had a long discussion about the merits of gRPC during our original design discussions about binary data support. In the end, we decided to go with a new serialization format (CBOR), but keep REST as the default mechanism for cross-service communications.

Note - JSON also supports schemas, and we discussed the merits of using a schema validation vs. an Go interface-based approach, and it was felt the latter was a better solution.

Regards,
/tony

But Trevor may point out that schemas also constrain that we provide more as opposed to just a name.

 

Was driving, so I was unsuccessful in unmuting, but want to chime in on log levels. I do think the default should not be trace.

 

Sincerely

Malini

 

From: <EdgeX-GoLang@...> on behalf of "espy via Lists.Edgexfoundry.Org" <espy=canonical.com@...>
Reply-To: "espy@..." <espy@...>
Date: Thursday, April 4, 2019 at 12:13 PM
To: "Conn, Trevor (EMC)" <Trevor_Conn@...>, "EdgeX-GoLang@..." <EdgeX-GoLang@...>, "edgex-tsc-core@..." <edgex-tsc-core@...>
Cc: "EdgeX-GoLang@..." <EdgeX-GoLang@...>
Subject: Re: [Edgex-golang] [Edgex-tsc-core] Core Contract Model Validators

 

On 4/4/19 9:49 AM, Tony Espy wrote:

On 4/2/19 6:38 PM, Trevor.Conn@... wrote:

Another update on this –

 

This week in the #core channel on Slack Ian Johnson asked me why the approach below didn’t involve creating a Validator interface and using reflection to go through the members of a type, calling Validate() on those that implemented the interface. As those on the last Core WG call will remember, this was the original idea I proposed. So since Ian effectively +1’ed my original proposal I decided to implement it for a side by side comparison.

 

The changes only needed to be made to the core-contracts module and can be found here:

https://github.com/tsconn23/go-mod-core-contracts/tree/validator_intf

 

This is the first Go-based reflection code I’ve written, so there could be other ways to do this. However, I used the JSON Unmarshaler implementation from the standard library as an example.

I did a bit of research on this and there's been at least one formal golang proposal to enhance the JSON annotation language to cover missing fields:

https://github.com/golang/go/issues/16426

That said, although the proposal was shot down, the use of a Validator interface was discussed as an alternative.

I'm +1 on the overall approach (including use of reflection).

 

With the current structure of our requests, it’s hard to know where to invoke the validation from. In this example, the DeviceService is the top-level struct so I start the process here. However one could argue that an Addressable should be able to validate independently since we can add/update an Addressable on its own. That would mean the Addressable would end up being validated twice in this case.

I think always validating when unmarshalling an external request makes sense, and one could argue that validating if/when we construct a new instance of a model makes sense too. I'm not sure I follow the exact flow where we'd validate twice?

An alternate/simpler approach to that solves the duplicate validation calls problem is just to make validation a responsibility of the original caller (ie. the function/method that makes the first json.Unmarshall()) instead of always calling Validate() at the end of a model's UnmarshallJSON().

Regards,
/tony

Since we pass a deeply nested multii-purpose model around instead of Request types specific to the API endpoint that could be used for this kind of control, it probably is what it is unless we refactor.

 

Also, w/r/t recursion, I’d look to accomplish that through the Validate() functions at each node in the graph. So DeviceService calls Validate() on all its sub-types where applicable, then those sub-types would do the same followed by their sub-types and so on.

Seems reasonable

Regards,
/tony

 

Let me know your thoughts. We’ll probably be talking about this at this week’s meeting.

 

Trevor

 

 

From: Conn, Trevor
Sent: Saturday, March 30, 2019 4:27 PM
To: EdgeX-GoLang@...; edgex-tsc-core@...
Subject: Core Contract Model Validators

 

Hi all -- I wanted to follow up on this discussion we've been having in the Core Working Group about validating our contract models and where that responsibility lies. This past week we talked about two options.

 

1.) Using a JSON Schema-based validation

2.) Creating a validator implementation that can be called as part of the UnmarshalJSON operation found in some of our models.

 

I explored the JSON Schema solution a bit on Thursday afternoon with help from Intel, but I did not find it palatable. The schema has to be created and maintained then loaded at runtime from either URL, file or inline string. Child types have their schema, but that schema must be either be repeated in the schema of the parent type or referenced via a pointer. This all seemed like too much work so before getting seriously into it, I decided to work on the validator idea.

 

I created a couple feature branches which you can review/pull at the links below. This involves changes to both core-contracts and edgex-go.

 

https://github.com/tsconn23/go-mod-core-contracts/tree/validators

https://github.com/tsconn23/edgex-go/tree/validators

 

Rather than ProvisionWatcher I have used DeviceService in my example above because it already had an UnmarshalJSON() method in place, and also with the ProvisionWatcher I'd have to account for DeviceProfile as well. I just wanted to get something up and going to try and prove the concept, so the scope was more reduced by using DeviceService.

 

Here's a sample request I'm using to test:

POST http://localhost:48081/api/v1/deviceservice

{"_class":"org.edgexfoundry.domain.meta.DeviceService","adminState":"unlocked","operatingState":"disabled1","name":"ds_test1","description":"manage home variable speed motors","lastConnected":0,"lastReported":0,"labels":["hvac","variable speed motor"],"origin":1471806386919}

 

In the request body above there are two issues. First, the operatingState value is invalid. Second, the Addressable is missing. If you execute the above locally, you'll receive a 400 (Bad Request) with the error message "invalid OperatingState 'disabled1'".

If you fix the OperatingState value, then you'll still receive a 400 with a new error message "Addressable ID and Name are both blank". This is coming from the validator inside of the Addressable model within core-contracts. I noticed that we'll have the opportunity to slim our controller logic by moving the server-side validation we do today into the models. We'll also have consistent state validation (example: validation today is different depending on if you're adding a new Device Service versus updating an existing one).

 

The order of operations with regard to unmarshaling and validation looks like this

  • DeviceService
    • UnmarshalJSON
      • OperatingState (if provided in the request)
        • UnmarshalJSON
        • validate()
      • AdminState (if provided in the request)
        • UnmarshalJSON
        • validate()
    • validate()
      • OperatingState
        • validate()
      • AdminState
        • validate()

Herein is the only thing that gives me minor heartburn about the solution. The validate() method of both OperatingState and AdminState are called as part of their respective UnmarshalJSON implementations. It doesn't have to be but I thought it made sense to do so since it allows us to fail faster if we detect a problem.

 

Assuming those pass, then DeviceService calls its own validate() method which takes care of validating all child types. This is necessary in case a child type's key was omitted in the JSON. This means validation for any of the child types can occur twice -- a cheap operation to be sure, but redundant at times.

 

As an aside, you will see changes in the feature branches related to the elimination of models.Service from core-contracts. Nothing else was composed from this model other than the DeviceService and I felt that eliminating it would simplify the solution.

 

Comments are welcome. If this looks like a viable approach to folks, then there might be some bandwidth soon to undertake adding UnmarshalJSON to the models that do not have them, and implementing the necessary validate() handlers.

 

Trevor Conn
Technical Staff Engineer
Dell Technologies | IoT DellTech
Trevor_Conn@...
Round Rock, TX  USA

Marking Events for CBOR Flow

Trevor.Conn@...
 

Hi all -- During today's Core Working Group call as we were discussing various ways for export-distro / app-functions-sdk to notify core-data that an event had been successfully handled, I believe Eric Cotter suggested that another option might be to use a checksum. After taking a long walk this evening, I began to think that might be worth exploring. I wrote the following very simple harness to help illustrate the idea.


https://github.com/tsconn23/event-checksum


The README gives instructions on how to build and run the app. Each time the app is run, a different checksum will be generated, owing largely to the different timestamp values on the event and readings during each execution. We could potentially integrate this kind of solution in the following manner.


1.) A CBOR event comes in.

2.) We obtain a checksum value for the byte array

3.) We store that checksum as part of the Event model in core-data

4.) The CBOR byte array is published through the bus

5.) The subscriber receives the byte array and also calculates the checksum

6.) When the subscriber is done handling the event, it calls a PUT on core-data passing the checksum

    6a.) This would entail a modification to the request type of the existing API

7.) Core-data locates the event and marks it as "pushed"


Given all of the elements that go into an event and its readings, I think the chance of obtaining a duplicate checksum is remote. If that's naive, please correct me. I'm only suggesting we do this for CBOR events right now but it could be done for any event since all it needs is a byte array.


Trevor Conn
Senior Principal Software Engineer
Dell Technologies | IoT DellTech
Trevor_Conn@...
Round Rock, TX  USA

[EdgeX-TSC-Core] CBOR - Re-encode Performance Metrics

Anthony Bonafide <anthonymbonafide@...>
 

Hello all,

    Last week during the Core Working Group call we were discussing a couple of ways to address some issues regarding capturing ID and time information when dealing with CBOR content in core-data. One option was to decode the original content, update the necessary data and re-encode to CBOR before publishing. Before going down that path we wanted to grab some performance data on the re-encoding process. I have created a Github repo which simulates the process of re-encoding similar to how we would want to implement the previously mentioned steps. There are instructions in the README on how to run the benchmark tests, along with the test structure.

     In a nutshell, the benchmark accepts 2 arguments. The first specifies how many iterations to execute, the second specifies the Event payload size. The options for the payload size are small, medium, and large which correspond to 100K,900K,and 12M respectively(to match the results shared by Tobias regarding checksum implementation). The logic for the bench mark works as follows:
  1. Create CBOR bytes of an Event with a payload of the specified size
  2.  Capture system metrics and capture the start time
  3. Decode CBOR to an Event
  4. Change the ID and Pushed fields
  5. Encode the updated Event
  6. Repeat steps 3-5 as per the specified number of iterations
  7. Calculate elapsed time
  8. Display metrics

    A run of the tests on my laptop resulted in the following:

Metric Event with 100K payload Event with 900K payload Event with 12M payload
Average Re-encode time(nano seconds) 4148 4497 4554
GC runs 590 755 186
GC stop the world time(milliseconds) 57 71 21
Memory allocations(Memory at test completion – memory before test) 30000096 30000148 30000046
System memory in bytes (Memory test completion – memory before test) 1179648 1441792 1179648

    There is more information provided by the CLI tool, but those are what I felt might be the most important. Also, there are a few Go Benchmark tests in the repo which benchmark encoding, decoding, and the re-encoding process as listed above. Each of the tests are isolated and provide another way to get performance metrics. The instructions for executing the Go Benchmark tests can also be found in the Go Benchmark Section of the README. Hopefully this helps and if anyone has any ideas or suggestions regarding more and/or better metrics feel free to reach out.


Thank you,

Anthony Bonafide

Re: [EdgeX-TSC-Core] CBOR - Re-encode Performance Metrics

Tobias Mosby
 

Yes, this is very helpful to see the metrics we can anticipate for re-encode, garbage collection, and memory utilization.

Thank you Anthony!

 

Are the timings you measured for re-encode in units of microseconds?  I re-ran and was looking to decode the actual value; will follow up with you.

 

This morning we discussed sharing anticipated response times for SHA256 vs MD5 checksum hashing algorithms.

Eric also suggested a look at XXHash so it is included below for comparison as well.

 

NOTES:

·         CBOR encode/decode was performed using go-codec v1.1.4 which is not yet added to EdgeX codebase. I also found that setting up a buffered reader adversely impacts CBOR encode performance.

·         Each run was provided same three payload sizes (small/medium/large), performed across 1k iterations respectively, with rolling average used to produce result set.

·         Only intentional difference between each run was to apply a different checksum method to compute the event/payload hash.

·         Takeaway: As seen in Run #3, xxHash.Checksum64 is demonstrably more performant. So if “weaker” (but speedy) hashes such as this were used in place of uuid and re-encode of the event model, mitigation of collisions may need to be addressed at persistence layer. For example, FIFO to resolve which event record to mark “pushed” if/when distinct events are assigned the same hash value; ref: https://github.com/Cyan4973/xxHash/issues/165

 

Run #1

Dimension

100kb Payload

900kb Payload

12MB Payload

Binary Image Load:

53µs

439µs

3850µs

CBOR Encode:

33µs

177µs

1700µs

Checksum (SHA256):

598µs

6000µs

85000µs

CBOR Decode:

90µs

1000µs

81000µs

Overall Iteration Time:

0.75ms

7.5ms

171ms

TOTAL Time (1000 iterations)

0.8s

7.5s

2m51s

 

Run #2

Dimension

100kb Payload

900kb Payload

12MB Payload

Binary Image Load:

49µs

452µs

3900µs

CBOR Encode:

50µs

205µs

1900µs

Checksum (MD5):

208µs

1900µs

24000µs

CBOR Decode:

63µs

980µs

77000µs

Overall Iteration Time:

0.37ms

3.5ms

107ms

TOTAL Time (1000 iterations)

0.4s

3.5s

1m47s

 

Run #3

Dimension

100kb Payload

900kb Payload

12MB Payload

Binary Image Load:

46µs

411µs

4000µs

CBOR Encode:

36µs

182µs

1800µs

Checksum (xxHash):

15µs

121µs

1700µs

CBOR Decode:

64µs

805µs

84000µs

Overall Iteration Time:

0.16ms

1.5ms

91ms

TOTAL Time (1000 iterations)

162ms

1.5s

7m36s

 

*Resulting metrics in the tables above must be taken as anecdotal (mileage may vary) since these were executed on my laptop in a VM with other services such as EdgeX stack running.

 

Best regards,
Toby

 

From: EdgeX-TSC-Core@... [mailto:EdgeX-TSC-Core@...] On Behalf Of Anthony Bonafide
Sent: Monday, April 22, 2019 2:14 PM
To: EdgeX-GoLang@...; edgex-tsc-core@...
Subject: [EdgeX-TSC-Core] CBOR - Re-encode Performance Metrics

 

Hello all,

 

    Last week during the Core Working Group call we were discussing a couple of ways to address some issues regarding capturing ID and time information when dealing with CBOR content in core-data. One option was to decode the original content, update the necessary data and re-encode to CBOR before publishing. Before going down that path we wanted to grab some performance data on the re-encoding process. I have created a Github repo which simulates the process of re-encoding similar to how we would want to implement the previously mentioned steps. There are instructions in the README on how to run the benchmark tests, along with the test structure.

 

     In a nutshell, the benchmark accepts 2 arguments. The first specifies how many iterations to execute, the second specifies the Event payload size. The options for the payload size are small, medium, and large which correspond to 100K,900K,and 12M respectively(to match the results shared by Tobias regarding checksum implementation). The logic for the bench mark works as follows:

  1. Create CBOR bytes of an Event with a payload of the specified size
  2.  Capture system metrics and capture the start time
  3. Decode CBOR to an Event
  4. Change the ID and Pushed fields
  5. Encode the updated Event
  6. Repeat steps 3-5 as per the specified number of iterations
  7. Calculate elapsed time
  8. Display metrics

 

    A run of the tests on my laptop resulted in the following:

 

Metric

Event with 100K payload

Event with 900K payload

Event with 12M payload

Average Re-encode time(nano seconds)

4148

4497

4554

GC runs

590

755

186

GC stop the world time(milliseconds)

57

71

21

Memory allocations(Memory at test completion – memory before test)

30000096

30000148

30000046

System memory in bytes (Memory test completion – memory before test)

1179648

1441792

1179648

 

    There is more information provided by the CLI tool, but those are what I felt might be the most important. Also, there are a few Go Benchmark tests in the repo which benchmark encoding, decoding, and the re-encoding process as listed above. Each of the tests are isolated and provide another way to get performance metrics. The instructions for executing the Go Benchmark tests can also be found in the Go Benchmark Section of the README. Hopefully this helps and if anyone has any ideas or suggestions regarding more and/or better metrics feel free to reach out.

 

 

Thank you,

 

Anthony Bonafide

Re: [EdgeX-TSC-Core] CBOR - Re-encode Performance Metrics

Tobias Mosby
 

I had inadvertently recorded a timing for the last cell in the timings as 7m36s. This resulted from my run of 5000 iterations on the final check, in attempt to smooth out noise on average timings occurring between runs. Updated for consistency.

 

Note the variance in timings between runs will be reduced by running longer tests. We should not expect a chosen checksum method to impact encode/decode timings, etc.

 

From: Mosby, Tobias
Sent: Thursday, April 25, 2019 8:19 AM
To: 'Anthony Bonafide' <anthonymbonafide@...>; 'EdgeX-GoLang@...' <EdgeX-GoLang@...>; 'edgex-tsc-core@...' <edgex-tsc-core@...>; 'EdgeX-TSC-Device-Services@...' <EdgeX-TSC-Device-Services@...>
Subject: RE: [EdgeX-TSC-Core] CBOR - Re-encode Performance Metrics

 

Yes, this is very helpful to see the metrics we can anticipate for re-encode, garbage collection, and memory utilization.

Thank you Anthony!

 

Are the timings you measured for re-encode in units of microseconds?  I re-ran and was looking to decode the actual value; will follow up with you.

 

This morning we discussed sharing anticipated response times for SHA256 vs MD5 checksum hashing algorithms.

Eric also suggested a look at XXHash so it is included below for comparison as well.

 

NOTES:

·         CBOR encode/decode was performed using go-codec v1.1.4 which is not yet added to EdgeX codebase. I also found that setting up a buffered reader adversely impacts CBOR encode performance.

·         Each run was provided same three payload sizes (small/medium/large), performed across 1k iterations respectively, with rolling average used to produce result set.

·         Only intentional difference between each run was to apply a different checksum method to compute the event/payload hash.

·         Takeaway: As seen in Run #3, xxHash.Checksum64 is demonstrably more performant. So if “weaker” (but speedy) hashes such as this were used in place of uuid and re-encode of the event model, mitigation of collisions may need to be addressed at persistence layer. For example, FIFO to resolve which event record to mark “pushed” if/when distinct events are assigned the same hash value; ref: https://github.com/Cyan4973/xxHash/issues/165

 

Run #1

Dimension

100kb Payload

900kb Payload

12MB Payload

Binary Image Load:

53µs

439µs

3850µs

CBOR Encode:

33µs

177µs

1700µs

Checksum (SHA256):

598µs

6000µs

85000µs

CBOR Decode:

90µs

1000µs

81000µs

Overall Iteration Time:

0.75ms

7.5ms

171ms

TOTAL Time (1000 iterations)

0.8s

7.5s

2m51s

 

Run #2

Dimension

100kb Payload

900kb Payload

12MB Payload

Binary Image Load:

49µs

452µs

3900µs

CBOR Encode:

50µs

205µs

1900µs

Checksum (MD5):

208µs

1900µs

24000µs

CBOR Decode:

63µs

980µs

77000µs

Overall Iteration Time:

0.37ms

3.5ms

107ms

TOTAL Time (1000 iterations)

0.4s

3.5s

1m47s

 

Run #3

Dimension

100kb Payload

900kb Payload

12MB Payload

Binary Image Load:

46µs

411µs

4000µs

CBOR Encode:

36µs

182µs

1800µs

Checksum (xxHash):

15µs

121µs

1700µs

CBOR Decode:

64µs

805µs

84000µs

Overall Iteration Time:

0.16ms

1.5ms

91ms

TOTAL Time (1000 iterations)

162ms

1.5s

7m36s 1m33

 

*Resulting metrics in the tables above must be taken as anecdotal (mileage may vary) since these were executed on my laptop in a VM with other services such as EdgeX stack running.

 

Best regards,
Toby

 

From: EdgeX-TSC-Core@... [mailto:EdgeX-TSC-Core@...] On Behalf Of Anthony Bonafide
Sent: Monday, April 22, 2019 2:14 PM
To: EdgeX-GoLang@...; edgex-tsc-core@...
Subject: [EdgeX-TSC-Core] CBOR - Re-encode Performance Metrics

 

Hello all,

 

    Last week during the Core Working Group call we were discussing a couple of ways to address some issues regarding capturing ID and time information when dealing with CBOR content in core-data. One option was to decode the original content, update the necessary data and re-encode to CBOR before publishing. Before going down that path we wanted to grab some performance data on the re-encoding process. I have created a Github repo which simulates the process of re-encoding similar to how we would want to implement the previously mentioned steps. There are instructions in the README on how to run the benchmark tests, along with the test structure.

 

     In a nutshell, the benchmark accepts 2 arguments. The first specifies how many iterations to execute, the second specifies the Event payload size. The options for the payload size are small, medium, and large which correspond to 100K,900K,and 12M respectively(to match the results shared by Tobias regarding checksum implementation). The logic for the bench mark works as follows:

  1. Create CBOR bytes of an Event with a payload of the specified size
  2.  Capture system metrics and capture the start time
  3. Decode CBOR to an Event
  4. Change the ID and Pushed fields
  5. Encode the updated Event
  6. Repeat steps 3-5 as per the specified number of iterations
  7. Calculate elapsed time
  8. Display metrics

 

    A run of the tests on my laptop resulted in the following:

 

Metric

Event with 100K payload

Event with 900K payload

Event with 12M payload

Average Re-encode time(nano seconds)

4148

4497

4554

GC runs

590

755

186

GC stop the world time(milliseconds)

57

71

21

Memory allocations(Memory at test completion – memory before test)

30000096

30000148

30000046

System memory in bytes (Memory test completion – memory before test)

1179648

1441792

1179648

 

    There is more information provided by the CLI tool, but those are what I felt might be the most important. Also, there are a few Go Benchmark tests in the repo which benchmark encoding, decoding, and the re-encoding process as listed above. Each of the tests are isolated and provide another way to get performance metrics. The instructions for executing the Go Benchmark tests can also be found in the Go Benchmark Section of the README. Hopefully this helps and if anyone has any ideas or suggestions regarding more and/or better metrics feel free to reach out.

 

 

Thank you,

 

Anthony Bonafide