Software Engineering
architecture dynamic-typing static-typing stack
Updated Fri, 27 May 2022 07:54:04 GMT

Dynamic typing across the whole technology stack - where to enforce data validity?


Over the past year or two, I've been playing with newer technologies in my side projects. As a web developer, I've moved away from the following stack (which I still use at work):

The 'classic' technology stack

  1. Web browser POSTs forms to...
  2. a C#-coded web application server-side, communicating with...
  3. external services via XML, then ultimately...
  4. writing the application state to a SQL database.

To using a very different arrangement:

The fully-dynamic technology stack

  1. Web browser submitting XMLHttpRequests to...
  2. a JavaScript-coded Node.js application server-side, communicating with...
  3. other external services via RESTful JSON APIs, then ultimately...
  4. writing the application state to a NoSQL database.

It's gotten to the point where my whole stack has absolutely no enforcement of type or schema, anywhere.

Now, this has been just fine before, when consuming others' web services. It might have even been fine up until the SQL databases were ditched (along with their DB schemas). But now I'm stuck at:

Where do I declaratively define the valid structure of my business-domain data?

I want to enforce data validity before making any of my projects publicly available. After all, what's to stop someone from just submitting invalid data to my services and using them as a hacky free database provider? At some point, it has to be enforced that "this web service will only accept a collection of at least one X. X must contain A, B, and optionally C".
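One way to read the quoted rule declaratively is to keep the rule itself as plain data and let a small generic checker interpret it. The following is only a minimal sketch in plain JavaScript: the field names `a`, `b`, `c`, the schema shape (`minItems`, `required`, `optional`), and the `validate` helper are all hypothetical and do not come from any existing schema library.

```javascript
// Hypothetical declarative rule: "a collection of at least one X;
// X must contain A and B, and optionally C". The shape of this object is
// an assumption made for illustration, not a standard schema format.
const xCollectionSchema = {
  minItems: 1,
  item: {
    required: { a: "string", b: "number" },
    optional: { c: "boolean" },
  },
};

// Own-property check that ignores inherited keys such as "toString".
const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Generic interpreter: returns a list of error messages (empty means valid).
function validate(schema, value) {
  if (!Array.isArray(value)) return ["expected an array"];
  const errors = [];
  if (value.length < schema.minItems) {
    errors.push(`expected at least ${schema.minItems} item(s)`);
  }
  const { required, optional } = schema.item;
  value.forEach((item, i) => {
    if (item === null || typeof item !== "object" || Array.isArray(item)) {
      errors.push(`item ${i}: expected an object`);
      return;
    }
    for (const [key, type] of Object.entries(required)) {
      if (!has(item, key)) {
        errors.push(`item ${i}: missing "${key}"`);
      } else if (typeof item[key] !== type) {
        errors.push(`item ${i}: "${key}" must be a ${type}`);
      }
    }
    for (const [key, type] of Object.entries(optional)) {
      if (has(item, key) && typeof item[key] !== type) {
        errors.push(`item ${i}: "${key}" must be a ${type}`);
      }
    }
    // Reject fields the rule does not mention, so the service cannot be
    // used as free storage for arbitrary data.
    for (const key of Object.keys(item)) {
      if (!has(required, key) && !has(optional, key)) {
        errors.push(`item ${i}: unexpected "${key}"`);
      }
    }
  });
  return errors;
}
```

With this sketch, `validate(xCollectionSchema, [{ a: "x", b: 1 }])` returns an empty array, while an empty collection or an item without `b` yields one error message each. The business rule lives in the data; the imperative code is written once and does not change when the rule does.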

The first solution I thought of was to do validation inside my Node.js code, through a big block of imperative code (if-elses and so on). This felt wrong.

I have been using CouchDB, and for a while I thought that it might be best to put that validation code in the _update handlers. At least we're performing validation as part of persistence, but it's still an ugly imperative block.
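For reference, validation at persistence time in CouchDB could look roughly like the sketch below. It uses CouchDB's `validate_doc_update` hook, which is a separate mechanism from the `_update` handlers mentioned above: CouchDB calls it on every write and rejects the document if the function throws an object with a `forbidden` key. The document fields (`type`, `a`, `b`, `c`) and the `check` helper are hypothetical.

```javascript
// Sketch of a CouchDB validation function. In a real design document this
// would be stored as a string under the "validate_doc_update" key; it is a
// plain function here so it can be exercised under Node.js.
// The document fields (type, a, b, c) are hypothetical.
function validateDocUpdate(newDoc, oldDoc, userCtx, secObj) {
  // Let deletions through without inspecting the body.
  if (newDoc._deleted) return;

  // Hypothetical helper: CouchDB treats a thrown { forbidden: ... }
  // object as a rejection of the write.
  function check(condition, message) {
    if (!condition) throw { forbidden: message };
  }

  check(newDoc.type === "x", 'type must be "x"');
  check(typeof newDoc.a === "string", "a must be a string");
  check(typeof newDoc.b === "number", "b must be a number");
  check(
    newDoc.c === undefined || typeof newDoc.c === "boolean",
    "c, if present, must be a boolean"
  );
}
```

This keeps invalid documents out of the database no matter which client writes them, but as noted above it is still a list of imperative checks rather than a declaration of the data's shape.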

Next, I looked into JSON schema languages. There is no standard as far as I can see, and I wasn't confident in the multiple solutions offered. I could roll my own, but then I'd only be expanding the body of non-standard JSON schema languages.

XML? I could put the X back into AJAX, and have every request schema-validated on the server side. That doesn't seem to follow the trends in software development, however. Neither would using an XML data store instead of a JSON CouchDB/BSON MongoDB persistence layer.

So, I'm stuck. Ideas?

TL;DR Where do I declaratively define the valid structure of my business-domain data when my technology stack has no statically typed object-oriented language and no schema-bound SQL database? Is it possible that there must be some static typing (or DB schema, or validated XML) somewhere for a technology stack to make any sense? If so, where?




Solution

Sounds to me like you're about to create an inner platform to compensate for the lack of semantics inherent to the language you chose.

Even schema definitions (be it an XML DTD, a JSON schema, or a Mongoose schema) will not give you the safety of static analysis. All you can really use them for is to guarantee that your system doesn't silently run into undefined behavior and ultimately fail.

I am not really sure why you won't use a language that simply provides this out of the box. Using a dynamically typed language and embedding type constraints into it seems to combine the worst of both worlds. Even more so because modern statically typed languages are able to infer vast parts of those constraints implicitly.

Personally, I suggest you take a look at Haxe's JavaScript backend. There's a site dedicated to Node.js development with Haxe, which would be a good place to start. Haxe's anonymous types can quickly be used to tie in JSON sources in a type-safe manner without any runtime overhead. And if you wish, its metaprogramming facilities allow you to automagically generate validation code from those types at compile time.

Of course, Haxe is far from the only option out there for targeting JavaScript in a type-safe manner. So if you feel you need the benefits of static typing, then you should invest the time to find a language to your liking that actually embeds this information right into its statically analyzable semantics.





Comments (5)

  • +3 – Static typing != input validation. — May 10, 2012 at 11:53  
  • +2 – @JoeriSebrechts: Any decent (i.e. sufficiently reflective) statically typed environment offers the possibility to generate basic validation from the type information (and some sort of annotations if necessary). Take: typedef Person = { firstName:String, lastName:String, email:Email, phone:Phone, mobile:Null<Phone> }. This has enough information to validate a registration form. I am not saying that static type systems can express all relationships. But it can express all the relationships one would find in a schema definition. — May 10, 2012 at 12:49  
  • +1 – @JoeriSebrechts: I disagree. Decent type systems allow you to formulate these kinds of constraints. In fact, they can express pretty much any kind of constraint that checks the validity of a field in isolation. From the above type definition, you can easily generate validation code that will check, for example, whether the provided phone number is valid. Why should I write any kind of code to do all this by hand, if I can embed this information into my language and then use metaprogramming to generate the needed code with it? — May 11, 2012 at 09:00  
  • +1 – @JoeriSebrechts: Take a look at Haskell's type system... — May 11, 2012 at 12:33  
  • +1 – Input validation is sometimes called the "boundary issue". When dealing with the boundary issue, static types are not enough, because you will never have enough information to statically elide all type information at compile time. You are forced to dynamically wrap, unwrap, and check data just as dynamically typed languages do, but now this only occurs at the boundary. This means we either have to embrace a dynamic typing methodology similar to Clojure's schema (github.com/plumatic/schema) or embrace dependent types. Once data is validated it can be tagged. — Sep 12, 2016 at 11:36