Why we don’t validate incoming XML
People often ask me why we don’t validate incoming XML against our XSD contracts at runtime.
It seems like the logical thing to do: why write an XSD and then ignore it at runtime?
Below I will give three reasons.
1) Performance
This is the simplest of the reasons, but it is not unimportant. Validating against an XSD is not a cheap operation, and doing it for every request when most requests are valid is a waste of time. CPU cycles aren’t the only consideration here. In order to validate against an XSD you need the XSD itself, so you have to either include the XSDs as embedded resources or deploy them alongside the binaries and load them at runtime. Either way the memory footprint grows, which is something we can’t afford on low-spec machines.
2) Be Liberal With Your Inputs
There is a general principle in API design that says you should be strict with your output but liberal with your input (often called the robustness principle). Basically, anything you can tolerate without crashing you should agree to work with, even if it violates the spec a bit. The same applies here: there may be elements that violate the schema but that you don’t really have an issue with. In such a scenario there is no reason to throw an exception just because the XSD says so.
You may wonder how such a thing can happen. How is it that the code doesn’t really require what the XSD says? I can imagine a few simple scenarios.
The simplest is that you didn’t write the contract. It may have been written by a third party, and we implement it without caring about everything they care about. Another scenario is element reuse: a certain field is required in most use cases, so it is marked as required, but in some scenarios it is not actually needed. There may also be elements that are conceptually required but that you can live without (header info, for example). You may also have a regex restriction that prevents illegal characters (perhaps for security reasons), where the violation can be resolved by sanitizing the input rather than throwing an exception.
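As a rough sketch of what this tolerant style can look like, here is a small reader written with Python's standard xml.etree.ElementTree. The element names (order, id, header/source, comment), the allowed-character rule, and the read_order helper are all hypothetical and only serve the illustration; they are not taken from any real contract. The reader insists only on what the code genuinely needs, falls back to a default for missing header info, and sanitizes a free-text field instead of rejecting the request.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical rule: the XSD pattern would only allow these characters
# in a comment. Anything else is stripped instead of rejected.
_DISALLOWED = re.compile(r"[^A-Za-z0-9 .,_-]")


def read_order(xml_text):
    """Read only what the code needs and tolerate everything else."""
    root = ET.fromstring(xml_text)  # malformed XML still raises ParseError

    order_id = root.findtext("id")
    if not order_id:
        # The one thing we genuinely cannot work without.
        raise ValueError("order id is missing")

    # Conceptually required header info, but we can live without it.
    source = root.findtext("header/source", default="unknown")

    # Sanitize rather than throw when the pattern restriction is violated.
    comment = _DISALLOWED.sub("", root.findtext("comment", default=""))

    return {"id": order_id.strip(), "source": source, "comment": comment}


# No header, and a comment containing characters the pattern forbids.
request = "<order><id>42</id><comment>rush; please!</comment></order>"
print(read_order(request))
```

The request above would fail a strict schema check twice over (missing header, illegal characters), yet the code has everything it needs to process it.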
3) Forward Compatibility
This is the biggest issue, and the only unsolvable one that I can think of. Let’s say that in version 1.0 we had three elements and in version 1.1 we added an optional fourth element. This is a fully backwards-compatible change, so we only incremented the minor version. Now what if a customer has both versions deployed and has upgraded their requests to use the new optional field? If we validate against the XSD and the customer sends a version 1.1 request to a version 1.0 deployment (which should be OK), it will fail XSD validation, because the 1.0 schema knows nothing about the fourth element.
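To make the scenario concrete, here is a minimal sketch of a version 1.0 reader that does not validate, again using Python's standard xml.etree.ElementTree. The customer, name, email, phone, and nickname elements and the read_customer_v1_0 function are made up for the example. The reader picks out the three elements it knows and never looks at anything else, so a version 1.1 request passes straight through.

```python
import xml.etree.ElementTree as ET

# The three elements that version 1.0 knows about (hypothetical names).
V1_0_FIELDS = ("name", "email", "phone")


def read_customer_v1_0(xml_text):
    """Version 1.0 reader: extracts its known fields and ignores the rest."""
    root = ET.fromstring(xml_text)
    return {field: root.findtext(field) for field in V1_0_FIELDS}


# A version 1.1 request carrying the new optional fourth element.
v1_1_request = (
    "<customer>"
    "<name>Ann</name><email>ann@example.com</email><phone>555-0100</phone>"
    "<nickname>Annie</nickname>"
    "</customer>"
)

print(read_customer_v1_0(v1_1_request))
```

The nickname element is simply ignored by the 1.0 reader, whereas validating the same request against a 1.0 schema that declares exactly three elements would reject it.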