Domain Specific Languages – Programming Languages


Why do you believe the future of programming language design lies in domain-specific languages when we can just do operator overloading and APIs in a general-purpose language like C++?
Well, that's a tough question, and actually there is a bit of philosophy or opinion here. In class, we keep brushing against Turing completeness, and it turns out that languages like C++ or C# or Java or MATLAB are all ultimately going to allow us to express the same computations. That is, if it's possible to do it in one language, it's also possible to do it in the other. But that doesn't say anything about how easy it is, how much help the language will give you in avoiding mistakes, and how fast the final result will be. In my opinion, three important features that a domain-specific language gives you over just coding up special types or operations in a general-purpose language like C++ or C# are the conciseness of the representation, the ability to do type checking or run-time checking or otherwise have safety built in, and the ability of the compiler or run-time system to do optimization.
Let's take something like MATLAB, which is very good at expressing mathematical operations on matrices, such as matrix multiply, in a concise manner, and use it as a sort of running example. You could code up the same sorts of operations, like matrix multiply, in a language like C or C++ or C#, and you might start out by writing functions that take a number of arguments and by making your own data types. At that point, it's relatively clear that a language like MATLAB, which just lets you use the star to multiply two matrices, is going to be more concise. However, once you start adding operator overloading, a feature of languages like C++ or C# that allows you to change the meaning of symbols like plus or star so that they, in essence, call functions you define, the conciseness argument is more of a wash, a tie between the two sides.
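To make the operator overloading point concrete, here is a minimal sketch in Python rather than C++ or C#. The names `Matrix` and `mat_mul` are made up for illustration and do not come from any particular library. It shows the same multiplication written as an explicit function call and as an overloaded star.

```python
class Matrix:
    """Tiny illustrative matrix type (hypothetical, not from any library)."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def __mul__(self, other):
        # Python turns `a * b` into a.__mul__(b): in essence, the star
        # now calls a function that we defined ourselves.
        return Matrix(mat_mul(self.rows, other.rows))

    def __eq__(self, other):
        return isinstance(other, Matrix) and self.rows == other.rows


def mat_mul(a, b):
    """Plain-function matrix multiply on lists of lists (no checking)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


a = Matrix([[1, 2], [3, 4]])
b = Matrix([[0, 1], [1, 0]])

verbose = Matrix(mat_mul(a.rows, b.rows))  # function-call style
concise = a * b                            # overloaded-operator style
```

Both `verbose` and `concise` hold the same product. Once the overload exists, the call site is about as short as it would be in a language with matrix multiplication built in.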
However, the other two aspects, type checking (or safety more broadly) and optimization, are still really critical. A language specifically designed to handle mathematics or matrices, something like MATLAB, is potentially going to notice more easily if you make mistakes related to that particular domain. For example, in C or C++, a two-dimensional matrix is often really just a single one-dimensional array, carefully indexed. There's some sort of stride scheme where we compute positions in one very long one-dimensional array from the row and column indices, as if it were a two-dimensional array. It is really easy to make mistakes: to pass arrays or matrices with the wrong dimensions to a matrix-matrix multiply, or to confuse row-major order and column-major order. A language like C or C++, depending on which matrix implementation you're using, won't give you any support with that. It will silently let you shoot yourself in the foot and compute the wrong answer. You might not even crash; you might just get something you're not expecting. That's really problematic because these days the constraint is often programmer time rather than CPU time. I'd rather use a language where these things are built in as first-class citizens and there's the possibility that it will alert me to errors. Many of you may have a favorite C++ or C# or Java matrix library that would catch those errors. Again, ultimately, since these languages have equivalent expressive power, you can add that sort of error checking to any language or library. But often domain-specific languages do a better job of it.
The third aspect is optimization. The higher-level the instructions you give to a compiler or interpreter, the more scope it has for creativity, the more chances it has to reorder your statements or implement them in another way, and the closer you can get to just being declarative: I want to multiply these matrices, and I don't care how you do it.
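Before going further into optimization, the safety point above can be made concrete. The following is a small sketch in Python, standing in for the C-style flat-array approach. All names here (`flat_matmul`, `to_row_major`, `to_col_major`, `CheckedMatrix`) are hypothetical and chosen only for illustration. The first half shows a row-major versus column-major mix-up that produces a wrong answer without any error. The second half shows the kind of dimension check a matrix-aware language or a careful library could perform.

```python
def flat_matmul(a, b, n):
    """Multiply two n x n matrices stored as flat lists, assuming row-major order."""
    return [sum(a[i * n + k] * b[k * n + j] for k in range(n))
            for i in range(n) for j in range(n)]


def to_row_major(rows):
    return [x for row in rows for x in row]


def to_col_major(rows):
    return [rows[i][j] for j in range(len(rows[0])) for i in range(len(rows))]


a_rows = [[1, 2], [3, 4]]
b_rows = [[5, 6], [7, 8]]

right = flat_matmul(to_row_major(a_rows), to_row_major(b_rows), 2)
# Mistake: b is handed over in column-major order. Nothing crashes;
# the function quietly multiplies by the transpose of b instead.
wrong = flat_matmul(to_row_major(a_rows), to_col_major(b_rows), 2)


class CheckedMatrix:
    """Matrix that knows its own shape and refuses mismatched products."""

    def __init__(self, rows):
        rows = [list(r) for r in rows]
        if not rows or any(len(r) != len(rows[0]) for r in rows):
            raise ValueError("rows must be non-empty and of equal length")
        self.rows = rows
        self.shape = (len(rows), len(rows[0]))

    def __mul__(self, other):
        if self.shape[1] != other.shape[0]:
            raise ValueError(f"cannot multiply {self.shape} by {other.shape}")
        inner, cols = other.shape
        return CheckedMatrix(
            [[sum(row[k] * other.rows[k][j] for k in range(inner))
              for j in range(cols)]
             for row in self.rows])


checked = CheckedMatrix(a_rows) * CheckedMatrix(b_rows)

try:
    CheckedMatrix([[1, 2, 3]]) * CheckedMatrix([[1, 2]])
    mismatch_caught = False
except ValueError:
    mismatch_caught = True
```

Here `right` and `wrong` differ even though both calls run without complaint, while the checked version raises an error as soon as the shapes do not line up. Note that the shape check alone would not catch the layout mix-up; that would need the layout to be part of the type as well.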
The more declarative the request, the more the compiler is able to take advantage of things like memory hierarchies, caches, and any special multimedia instructions you might have in order to get the job done well under the hood. If you actually write out your matrix-matrix multiply as three nested for loops, you're forcing the compiler to generate code for that particular implementation. Often domain-specific languages allow you to express things like matrix transposition or multiplication at a very high level, and thus they can end up generating better code for new target architectures than you would get if you coded it up yourself. So you end up spending less time writing: it's concise, you get guarantees, and it's faster.
In fact, you can view the push to domain-specific languages as just an extension of the push from assembly languages to high-level languages. One of the first arguments made in favor of higher-level languages came from early studies in computer science, which found that the number of lines of code a programmer could write per day, averaged over the lifetime of a project, was essentially constant regardless of the language being used. You could pay your programmers and get ten lines of assembly, or you could pay your programmers and get ten lines of C or ten lines of Python or some such. If you have experience with multiple languages, you can typically get significantly more done in ten lines of Python than in ten lines of assembly because of built-in support for dictionaries, lambdas, higher-order data types, object-oriented programming, that sort of thing. Ultimately, you can get everything done in assembly, but it would take longer.
The same argument applies to domain-specific languages. Take a newer, more exotic domain-specific language; let's pick MacroLab. That's a favorite of mine, for programming wireless sensor networks: moving data around and performing distributed computations.
Let's say, for example, that you want to keep track of people passing by your storefront advertising window to see if it's actually working in attracting people. You might set up some sort of wireless sensor network to keep track, and you want to write programs for it. You could write them in C or assembly language, or you could use this sort of domain-specific language, since it has notions of distributed computation, where data lives, and sending information back and forth built in as primitives. It's able to check for mistakes for you, generate good code for you, and in general improve your productivity.
Could you get all of that done with a well-crafted library? Yes. And in some sense, at the end of the day, maybe there is no difference between a well-crafted library and a domain-specific language. But I think we'll see a lot of the initial effort or improvement coming on the domain-specific language side, and then the libraries will follow and catch up.
