Wednesday, April 30, 2014

Final Haskell Project: A brief study in Parallelism


Nested List Processing

As we have learned previously in this blog, lists are central to Haskell, so unsurprisingly one of the most important questions to address when programming in Haskell is how exactly operations are applied to lists.

To see some of the issues that come up with operations on lists, let's write a quick maximum function that takes a list and returns its largest value.
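Here is a minimal sketch of the function (the empty-list equation is an extra guard of my own, since an empty list has no maximum):

import Data.List

maximum' :: (Ord a) => [a] -> a
maximum' []     = error "maximum' of an empty list"   -- extra guard: an empty list has no maximum
maximum' [x]    = x                                    -- base case: a one-element list is its own maximum
maximum' (x:xs) = max x (maximum' xs)                  -- compare the head with the maximum of the rest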


This is a pretty standard way to do this in Haskell; let's take a closer look at how it works.

First, we must import the Data.List library.

Next we can define a function and give it a type signature.

The first line of actual code reads: 

maximum' :: (Ord a) => [a] -> a

You may remember that the (Ord a) constraint means the elements must be orderable, i.e., they can be compared with one another. In this case we tell the function to take a list ([a]) and return a single value (a).

The next line handles the base case: a list with a single element simply returns that element, which the pattern binds to the variable x.

Finally, the line maximum' (x:xs) = max x (maximum' xs) starts at the head of the list and recursively finds the maximum value: it compares the head x against the maximum of the rest of the list.

So far this is nothing we haven't already done in this blog, so we can expect our simple maximum function to work. Here are a few examples:
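For instance, in GHCi (the input values here are just illustrative):

ghci> maximum' [1,3,5,7,9]
9
ghci> maximum' [2,8,4,6]
8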



Success!

And again nothing really surprising here.

So where does the problem come in? Consider this: what if we had a list of lists and wanted to find the biggest value in the entire collection? What would happen if we just gave our regular function a list of lists? Let's see!

So here is the same function with a list of lists being passed in, rather than a list of simple integers:


What we would want is for the function to return a value of 11. Here is the result when the function is run:
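The exact input values here are illustrative; any nested list whose biggest element is 11 shows the problem:

ghci> maximum' [[1,2,3],[4,11],[7,8]]
[7,8]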


Shoot! That's not quite what we were expecting. Why was a list returned instead of a single integer? The way this function is currently written cannot handle nested lists the way we want: because lists themselves have an Ord instance, it compares the sub-lists as whole values (element by element, lexicographically) and returns the "greatest" sub-list rather than the highest value found inside any sub-list. How can we modify this?

In Haskell a pretty popular way to handle these lists is to flatten them. One way to do this is with a function in the standard Data.List library called intercalate, which joins a list of lists into a single list with a separator in between; passing the empty list [] as the separator simply flattens the list of lists (the same effect as concat). Here is how I decided to implement this in Haskell:
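A sketch of that flattening approach (the helper name maximumNested is an illustrative choice, and it reuses the maximum' function from above):

import Data.List (intercalate)

-- Flatten the nested list first, then reuse maximum' on the single flat list.
maximumNested :: (Ord a) => [[a]] -> a
maximumNested xss = maximum' (intercalate [] xss)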





And here is the result:
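Using the same illustrative nested list as before:

ghci> maximumNested [[1,2,3],[4,11],[7,8]]
11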


Now we can observe the result of 11 that we might have expected in the first place. This raises the question: which is the better way to handle lists? That depends. If you have nested lists and you want the maximum of any element in any sub-list, then flattening first is the correct approach. However, if you want to know which sub-list is the "greatest" when the sub-lists are compared as whole values, the first, non-flattened approach is the correct one.

This is the topic I decided to look at for my final example of Haskell programming. Overall I enjoyed learning about this language even though it was far more challenging than I had anticipated. Functional programming requires a very different way of thinking than the traditional object-oriented programming I was familiar with. It was an interesting study.

I hope you had as much fun reading this blog as I had writing it. 

That's all for now.




Tuesday, April 15, 2014

Haskell Week 9

Syntactic Sugar:

Syntactic sugar is a way of making code more concise, which makes it easier not only to write but also to read.

Haskell is notorious for using these little shortcuts to make bulky code "sweeter" and easier to understand. In a number of the examples in this blog, the syntax I used was the syntactic-sugar equivalent of longer, less attractive code.

Here is one example of a map that we looked at earlier in this blog:
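In GHCi it looks like this:

ghci> map (+1) [1,3,5,7,9]
[2,4,6,8,10]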



As you can see, I used a map to add one to each of the integers in the list [1,3,5,7,9]. Thanks to the magic of syntactic sugar, I was able to simply say map (+1) and Haskell knew exactly what I was talking about. The way Haskell can do this is by taking my "sweet" version of the syntax and converting it back to the clunkier, less appealing code. In this case the section (+1) translates to the lambda \x -> x + 1, so map (+1) is really map (\x -> x + 1).

Another "sweet" piece of syntax in the same example is the list itself. To make the list I simply write the numbers I want between brackets and they become a new list. Magic, right? Wrong! Syntactic sugar! Once again, Haskell interprets my list as (:) 1 ((:) 3 ((:) 5 ((:) 7 ((:) 9 [])))). This is the longhand way of creating a list: it uses the cons operator (:) (a data constructor, but not a constructor in the Java sense) to prepend each element onto the empty list.
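The more common way to write the desugared form uses (:) infix, and GHCi confirms the two forms are the same list:

ghci> 1 : 3 : 5 : 7 : 9 : [] == [1,3,5,7,9]
True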

Other shortcuts (Most of which I have used):

You can let Haskell do a lot of the work for you, specifically with sequences. If I were to write print [1..4], Haskell would know to print [1,2,3,4], because I set a start point (1) and an end point (4) and it understands the ".." to mean "fill in the blanks here". The longer way to do this would be to use the built-in function enumFromTo and call it with the two endpoints, like so: enumFromTo 1 4.
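Both forms give the same result in GHCi:

ghci> [1..4]
[1,2,3,4]
ghci> enumFromTo 1 4
[1,2,3,4]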

Even such things as if-else expressions (which really desugar into case expressions) are generally written in the shorthand form, which I find to be fine as long as there is an understanding of the "unsweet" code behind the abbreviations.
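As a small sketch (the function names here are just illustrative), an if-else expression like this:

describe :: Int -> String
describe n = if n > 0 then "positive" else "not positive"

is equivalent to a case expression on the Bool condition:

describe' :: Int -> String
describe' n = case n > 0 of
                True  -> "positive"
                False -> "not positive"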

Monday, April 14, 2014

Haskell Week 8

Exceptions:

Exceptions and the way we handle them in Haskell are very similar to the methods we use in Java. Exceptions can be thrown, caught, and handled.

Some of the most common kinds of IO errors in Haskell can be recognized with predicates from System.IO.Error, such as:

isAlreadyExistsError
isDoesNotExistError
isAlreadyInUseError
isFullError
isEOFError
isIllegalOperation
isPermissionError
isUserError
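Here is a small sketch of catching and inspecting one of these errors (the file name is just an example):

import System.IO.Error (catchIOError, isDoesNotExistError)

main :: IO ()
main = do
    contents <- readFile "missing.txt" `catchIOError` handler
    putStrLn contents
  where
    handler e
      | isDoesNotExistError e = return "That file does not exist."  -- recover with a default value
      | otherwise             = ioError e                           -- re-throw anything else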

I find that the error reporting is much more readable and helpful in Haskell than in some of the other languages we have looked at, such as Java, VB, C#, and Python.

Here is an error I was given when I tried to run a function in GHCi with arguments whose types were ambiguous, so no instances of the required type classes could be found:

<interactive>:22:1:
    No instance for (Fractional [t0]) arising from a use of `myDiv1'
    Possible fix: add an instance declaration for (Fractional [t0])
    In the expression: myDiv1 50 [1, 2, 5, 0, ....]
    In an equation for `it': it = myDiv1 50 [1, 2, 5, ....]

<interactive>:22:8:
    No instance for (Num [t0]) arising from the literal `50'
    Possible fix: add an instance declaration for (Num [t0])
    In the first argument of `myDiv1', namely `50'
    In the expression: myDiv1 50 [1, 2, 5, 0, ....]
    In an equation for `it': it = myDiv1 50 [1, 2, 5, ....]

<interactive>:22:12:
    No instance for (Num t0) arising from the literal `1'
    The type variable `t0' is ambiguous
    Possible fix: add a type signature that fixes these type variable(s)
    Note: there are several potential instances:
      instance Num Double -- Defined in `GHC.Float'
      instance Num Float -- Defined in `GHC.Float'
      instance Integral a => Num (GHC.Real.Ratio a)
        -- Defined in `GHC.Real'
      ...plus three others
    In the expression: 1
    In the second argument of `myDiv1', namely `[1, 2, 5, 0, ....]'
    In the expression: myDiv1 50 [1, 2, 5, 0, ....]


Not only is the error shown, but possible solutions are also given to attempt to correct it. I have found that these suggestions are actually very helpful and often lead to fixing the problem and producing working code.

Haskell Week 7


Concurrency vs Parallelism

According to the Haskell Documentation there is a distinct difference between concurrency and parallelism.


Parallelism is "a technique to make programs faster by performing several computations in parallel. This requires hardware with multiple processing units" (Haskell)

Concurrency, by contrast, "refers to techniques that make program more usable. Concurrency can be implemented and is used a lot on single processing units, nonetheless it may benefit from multiple processing units with respect to speed" (Haskell).

So what exactly is the difference?

According to Simon Marlow, in his blog ghcmutterings, "a concurrent program is one with multiple threads of control", all of which may be interleaved on a single processor and share that processor's resources.

This leads to the question: what is a thread?

Threads are contained inside processes. A process can have multiple threads, all of which share the process's resources but are also able to accomplish goals individually. An application may have multiple processes, and those processes may spin up multiple threads. In the concurrent picture above, all of this can happen on a single processor.
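A minimal sketch of concurrency in Haskell uses forkIO from Control.Concurrent to start a second thread of control:

import Control.Concurrent (forkIO, threadDelay)

main :: IO ()
main = do
    _ <- forkIO (putStrLn "hello from a second thread")
    putStrLn "hello from the main thread"
    threadDelay 100000  -- give the forked thread a moment to finish before the program exits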

The key distinguishing factor with parallel processing is that more than one CPU is needed.
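In Haskell, a minimal sketch of parallelism uses par and pseq from the parallel package's Control.Parallel module to evaluate two computations at the same time (the program must be built with the threaded runtime and run with something like +RTS -N):

import Control.Parallel (par, pseq)

-- Spark the evaluation of `a` in parallel while the current thread computes `b`.
parSum :: [Int] -> [Int] -> Int
parSum xs ys = a `par` (b `pseq` (a + b))
  where
    a = sum xs
    b = sum ys

main :: IO ()
main = print (parSum [1..1000000] [1..1000000])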

There has been an extensive amount of research into the benefits of parallelism.

I found Simon Marlow's tutorial very helpful in trying to understand this complex, high-level topic:

http://community.haskell.org/~simonmar/par-tutorial.pdf