Friday, January 4, 2008

Is Dynamic Typing Superior?

There is a lot of discussion about the pros and cons of dynamically and statically typed languages. After being exposed to both, I can clearly say which type of language I favor: static typing.

Yes, the old, bloated and ugly statically typed languages such as C# or Java, not dynamically typed languages such as Python or Ruby.

The critics of statically typed languages point out that static typing entails the declaration of types, which is bad because it increases the code size, which in turn increases the bug count, because bug counts usually grow with code size.

A big codebase is bad, as Steve Yegge mentioned in his post:
Wow! I've sure seen that before, and I could really empathize with him. Geoworks had well over ten million lines of assembly code, and I'm of the opinion that this helped bankrupt them (although that also appears to be a minority opinion – those industry programmers just never learn!) And I worked at Amazon for seven years; they have well over a hundred million lines of code in various languages, and "complexity" is frequently cited internally as their worst technical problem.
But I am not sure that dynamic typing is a panacea for the complexity problem.

Static typing requires one to declare the type of a variable up front, and it does take a few characters to do it. But it's just that, a few characters per variable. Do those few characters increase the code complexity by a large margin and make the code significantly harder to read or harder to develop? I don't think so. I think the converse is true: knowing the type of a variable makes it easier to understand the code.

Let's do a thought experiment. Which would you rather call:
  1. void CalculateCurrency(object[] currencies), or
  2. void CalculateCurrency(string[] currencies)
I bet most of us would rather call 2, because we want to know the types so that we make fewer mistakes in our coding, and if we do make them, we can detect the typing errors at compile time.

But a dynamically typed language will not offer you this option. You will always have to work with unknown types. You don't know you are using the wrong types until you run your program and get a runtime error. In this sense, dynamic typing actually increases the complexity because of:
  1. The lack of compile-time type checks; type errors surface only at runtime.
  2. The difficulty of maintaining code without type information.
  3. Primitive IDE IntelliSense and refactoring tools.
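To make the runtime-error point concrete, here is a minimal Python sketch. The function calculate_currency and its inputs are hypothetical, loosely modelled on the thought experiment above, and it assumes each currency is a code string such as "usd".

```python
def calculate_currency(currencies):
    """Normalize currency codes to upper case (hypothetical example)."""
    # Nothing in the signature says what `currencies` must contain.
    return [code.upper() for code in currencies]


# Works as intended with a list of strings.
print(calculate_currency(["usd", "eur"]))

# A list of numbers is accepted without complaint when the call is written;
# the mistake only surfaces when this line actually runs.
try:
    calculate_currency([100, 200])
except AttributeError as error:
    print("runtime error:", error)
```

The second call is never flagged while the code is being written; the failure shows up only on the code path that executes it.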
Since typing errors are very real, the dynamic typing proponents are forced to resort to:
  1. Having a set of unit tests just to test the types of returned values
  2. Adopting a naming notation for variables, for example prefixing the type to the variable name (Hungarian notation)
  3. Documenting every variable's type extensively so as not to create confusion
Or else how can they claim that "dynamic languages are better"? But none of these completely solves the typing error problem; they can only reduce it to a certain extent. And, this is important, all of them come at the expense of other things. Let's examine them one by one:
  1. If you use unit tests to validate types, then you are writing extra test code. Test code needs to be maintained, just like your production code. So this technique does not reduce the codebase at all; in fact, it increases it.
  2. If you want to specify the type information in your variable names, then your variables will have very long names, and this makes your code harder to read too. Sure, you may reduce the binary size of your apps at installation time, but the size of your program, which is written in a high-level language and measured by its raw source file size, will not necessarily decrease.
  3. Extensive documentation for every variable? Are you kidding? Programmers don't read or maintain documentation!
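To illustrate the first point, here is a minimal sketch of a unit test whose only job is to check a returned type. The function get_exchange_rate and its rates are made up for illustration; only the standard unittest module is assumed.

```python
import unittest


def get_exchange_rate(currency_code):
    """Return a hard-coded rate (hypothetical function, made-up numbers)."""
    rates = {"USD": 1.0, "EUR": 0.68}
    return rates[currency_code]


class ExchangeRateTypeTest(unittest.TestCase):
    # This test says nothing about whether the rate is correct. It checks
    # only the returned type, which a compiler verifies without any test
    # code in a statically typed language.
    def test_rate_is_float(self):
        self.assertIsInstance(get_exchange_rate("USD"), float)


# Run the single test case programmatically.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExchangeRateTypeTest)
result = unittest.TextTestRunner().run(suite)
```

The test class is longer than the function it guards, and it has to be kept in sync with that function from now on.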
To address the problem of a big codebase, one should refactor, reuse and refocus. Refactor the code to make it more readable, reuse as many software components as possible and refocus on implementing the essential features the customers really want. Skipping the type declarations in an attempt to make the codebase smaller just won't cut it.

Strong Typing vs Strong Testing?
Bruce Eckel concluded his excellent post Strong Typing vs. Strong Testing by saying that compiler checks for typing constraints are not sufficient to guarantee a bug-free program. We need strong testing, not strong typing.

I agree.

But these two are not mutually exclusive. You can have strong testing when you have strong typing. The point is that with strong typing, you transfer the type validation task from your test code to the compiler. This results in less code being written.

Eckel mentioned that
And if a Python program has adequate unit tests, it can be as robust as a C++, Java or C# program with adequate unit tests (although the tests in Python will be faster to write).
That is a big "if".

Although unit tests are good, there are things that cannot be unit tested, or are very hard to unit test, such as UI code or data layer code. (For automated testing of UI code, you may need GUI test tools, but even that is difficult.) Statically typed languages at least enforce some checks on the code by examining the types at compile time, but dynamically typed languages cannot enforce any such checks before the program runs.

In this regard, rejecting statically typed languages because static typing is not sufficient to solve all software development problems does not seem like a good idea.

Follow-up post: Dynamic Typing Reduces Code Complexity? A Reply

11 comments:

Simon said...

Hmm. It seems you missed the point entirely.

The whole idea is that without static typing, you don't CARE what type something is. It doesn't matter if it's an 'object' or a 'string' or anything else, if it implements an interface that is compatible with what the programmer intended, it works with no modifications. And the programmer doesn't even have to define said interface.

The errors will arise when trying to use a method on an object that doesn't support it.

There's no reason to test the return values of things, because you're simply not interested. If a method returns something that is useless, an error will pop up anyway, but in the case where it returns something that is in fact useful, but the original programmer didn't take into account, things will work out seamlessly.

It can be difficult to debug if the error occurs on a value in the call stack far from where it was returned, but then again it rarely even happens, and the added flexibility is definitely worth it.

Soon Hui said...

Simon, thank you for your comment.

As you said, dynamic typing will generate an error when one uses a method on an object that doesn't support it.

But programmers need to fix these errors. Since the compiler won't do the job for them, they have to create unit tests, or extensively document their method parameters to remind themselves how to use the methods.

And this is the point of my post: creating unit tests, extensive documentation or whatever does not help to reduce the code complexity at all.

noblemaster said...

@soon: Great article!

@simon: Your argument actually supports soon's case. Errors will occur because you call methods that do not exist in an object. With dynamic typing, nothing can tell you this before the program runs.

And as far as testing goes: statically typed languages do not require you to test for the correct type. Testing occurs at a much higher level. You basically evade all the low-level type checks in statically typed languages.

Todd Werth said...

I think you're confusing a few things about dynamic languages, which is common for people who come from a static language background. I was similarly confused when I came to dynamic languages from C++ and Java.

First, dynamic languages don't make programs smaller just because you don't have to place the type before the variable. You can achieve that in statically typed languages with type inference. Dynamically typed languages allow you to basically have automatic generics and interfaces. This is where the true savings come in.

It's similar to generics in Java/C#; you have no idea what type will be used at runtime, but you can still implement your features and as long as the runtime object implements the interface, it will work. Dynamic languages, basically, automatically implement that interface for you.

Once again, you just don't care about the type. You care what an object can DO, not what an object IS. You may check if an object implements a certain method (message) but you would never check or verify an object's type.
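A minimal Python sketch of this idea follows. The class and function names are hypothetical; the point is only that describe never inspects the type of its argument, it just checks for and sends the speak message.

```python
class Duck:
    def speak(self):
        return "Quack"


class Robot:
    # Unrelated to Duck: no shared base class and no declared interface.
    def speak(self):
        return "Beep"


def describe(thing):
    # Check what the object can do, not what it is.
    if hasattr(thing, "speak"):
        return thing.speak()
    return "(silent)"


for thing in (Duck(), Robot(), 42):
    print(describe(thing))
```

Duck and Robot both work with describe even though neither declares anything in common, while the integer simply falls through to the fallback branch.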

It's confusing, and it certainly is a different way to work. But once it "clicks" in your mind and you fully understand, it really starts to make sense, allowing you to express complex logic in a much simpler form.

Static typing has its advantages too. As dynamic languages have existed for decades, the pros and cons of each are well documented.

You should give dynamic languages more of a chance. Work with them until you get to the point where you aren't just writing Java/C# style code in Ruby/Python/etc, but fully utilizing the benefits of the dynamic language. Even if you never actually use a dynamic language in everyday programming, the experience will give you a richer view of what is possible in Software Engineering.

Soon Hui said...

Todd, thank you for your comments. But there are a few points I would like to respond to.

Dynamically typed languages allow you to basically have automatic generics and interfaces. This is where the true savings come in.

It's similar to generics in Java/C#; you have no idea what type will be used at runtime, but you can still implement your features and as long as the runtime object implements the interface, it will work. Dynamic languages, basically, automatically implement that interface for you.

With this I agree.

But the difference between static typing and dynamic typing is, as I have pointed out, that compilers for statically typed languages can check the types of the objects to see whether they conform to a given interface, whereas compilers for dynamically typed languages can't do that at all.

For dynamically typed languages, of course the program will work if an object implements the required methods, but what if it doesn't have the correct implementation? You will need to wait until runtime to find out.

Jeremy Weiskotten said...

I understand your concern about "mystery types". I had the same before I started programming in Ruby (formerly in Java and C++ before that).

However, the issue just doesn't come up that often in the real world. You don't find yourself wondering, "Is this argument a String or something else?" Even when improving existing code that you've never seen before, it's usually not much of a problem as long as variables and methods were named well.

In fact, in your example, typing the argument as an array of string doesn't help much. What does it mean for a currency to be a string? Is it a standard abbreviation like "USD" for US Dollars, or is it spelled out "US Dollars", or is it simply "dollars"? Are they currency amounts (in which case I'd expect the type to be numeric)? There are many potential levels of ambiguity that you'd need documentation/tests/trial-and-error to resolve.

I find it analogous to life in Java before generics. It was rare (at least in my experience) to encounter a bug where someone put an object of the wrong type into a collection -- ClassCastExceptions are not all that common.

Mikkel Garcia said...

As others have mentioned, coming from a C++/Java background into Ruby or Python, it can be difficult to understand the merits of a dynamically typed language.

(In my experience) Many teams I've encountered in the Java world only have the barest of unit tests, or sometimes no unit tests at all. One of the reasons I've attributed to this is that the compiler tells the developers if there is something terribly out of place. That warm fuzzy feeling of compiler acceptance is mistaken for program correctness.

In an interpreted dynamic language, there is no fuzzy feeling, leading more developers to test program correctness (instead of compiler acceptance).

The more people I work with testing their code - the less effort we spend fixing their code, and the more we can spend addressing the use cases.

There are many more reasons to learn and use other languages (not just Ruby and Python, but Haskell, Erlang, Scala, etc), and I hope you don't write them off because a feature seems wrong.

Some just take getting used to.
-Mikkel

Anonymous said...

To expand upon some of the other comments, it's all where you choose to draw the line between what the compiler should do and what the programmer should do.

For example, Java does not implement Design By Contract(TM) (http://en.wikipedia.org/wiki/Design_by_contract) in the form of preconditions, postconditions, and invariants. The Java compiler, therefore, cannot automatically check program logic beyond mere type declarations. Hence, Java should never be used, and we should all be programming in Eiffel...right?

Most Java programmers have probably never heard of Design By Contract. Those exposed to the idea might well regard it as silly, since there are other ways of ensuring program correctness, like typing. And, Design By Contract "bakes in" those rules into the program, meaning they cannot adapt to changing circumstances except by changing the code. I suspect a lot of Java programmers would consider Design By Contract to be a whole lot of pain for very little gain.

That's the same mindset that fans of dynamic typing have for static typing. It's all where you choose to draw the line between what the compiler does and what the programmer does.

Matthew said...

You wrote:

"Since the typing error is very real, the dynamic typing proponents are forced to resort to

1. Have a set of unit tests just to test the returned variable types
2. Adopt a sort of notation when comes to variable naming. For example, prefix the type in front of a variable name(Hungarian notation)
3. Document extensively every variables types so as not to create confusion"

I have not seen 1, 2, or 3 in practice, but my experience is limited. Are there specific projects you have in mind with regard to these comments?

Soon Hui said...

Matthew,

You may want to refer to Strong Typing vs. Strong Testing for an example of such unit tests.

2) was the option I adopted for my JScript projects because it was insanely difficult to create unit tests for them.

3) was also the approach I used when I needed to pass custom objects back and forth. I am sure other people are doing the same as well. Or else how can you communicate the intent of your code?

Ross said...

"Since the typing error is very real, the dynamic typing proponents are forced to resort to..."

So your complaint is that writing in a dynamically-typed language encourages you to test your code, have naming conventions, and document your interfaces?

I must be missing something, those all sound like good things to me...