What is a good way to deal with tasks that require arrays using Haskell?

11

Often a task requires real arrays. Take for instance the task to implement Befunge or ><>. I tried to use the Array module for this, but it's really cumbersome, as it feels like I'm coding much too verbose. Can anybody help me how to solve such code-golf tasks less verbose and more functional?

FUZxxl

Posted 2011-03-24T20:19:51.473

Reputation: 9 656

AFAIK, this site is only for code golf itself, not for related questions. I'm guessing that this belongs on Stackoverflow. – Jesse Millikan – 2011-03-24T20:21:18.910

@Jesse Millikan: Please also see this question and read the faq. It doesn't states, that you're not allowed to ask any questions about how to golf. This kind of questions were also a big part of the "on topic" question in the definition phase of this site. Please double-think your downvote and remove it if you understand why I'm asking this here.

– FUZxxl – 2011-03-24T20:24:50.377

Hmm, my bad I guess. – Jesse Millikan – 2011-03-24T20:25:39.297

@Jesse Millikan: Errare humanum est. – FUZxxl – 2011-03-24T20:26:59.203

The FAQ is not very clear about it, though. – Jesse Millikan – 2011-03-24T20:27:44.847

Yes it is. I'm going to open a meta-question on this issue now. – FUZxxl – 2011-03-24T20:30:23.837

I've made the question CW, in line with similar "golfing tips in XXX"-type questions. Once we have more consensus with what to do, then more can happen (or not). – Chris Jester-Young – 2011-03-24T23:32:16.200

For me, this question is ok here, but if this is on Stack Overflow, this wouldn't need to make it CW, and answerers could earn reps at least. – YOU – 2011-03-25T05:50:45.903

Answers

5

First of all, I recommend looking at Data.Vector, a nicer alternative to Data.Array in some cases.

Array and Vector are ideal for some memoization cases, as demonstrated in my answer to "Finding maximum paths". However, some problems simply aren't easy to express in a functional style. For example, Problem 28 in Project Euler calls for summing the numbers on the diagonals of a spiral. Sure, it should be pretty easy to find a formula for these numbers, but constructing the spiral is more challenging.

Data.Array.ST provides a mutable array type. However, the type situation is a mess: it uses a class MArray to overload every single one of its methods except for runSTArray. So, unless you plan on returning an immutable array from a mutable array action, you'll have to add one or more type signatures:

import Control.Monad.ST
import Data.Array.ST

foo :: Int -> [Int]
foo n = runST $ do
    a <- newArray (1,n) 123 :: ST s (STArray s Int Int) -- this type signature is required
    sequence [readArray a i | i <- [1..n]]

main = print $ foo 5

Nevertheless, my solution to Euler 28 turned out pretty nicely, and did not require that type signature because I used runSTArray.

Using Data.Map as a "mutable array"

If you're looking to implement a mutable array algorithm, another option is to use Data.Map . When you use an array, you kind of wish you had a function like this, which changes a single element of an array:

writeArray :: Ix i => i -> e -> Array i e -> Array i e

Unfortunately, this would require copying the entire array, unless the implementation used a copy-on-write strategy to avoid it when possible.

The good news is, Data.Map has a function like this, insert:

insert :: Ord k => k -> a -> Map k a -> Map k a

Because Map is implemented internally as a balanced binary tree, insert only takes O(log n) time and space, and preserves the original copy. Hence, Map not only provides a somewhat-efficient "mutable array" that is compatible with the functional programming model, but it even lets you "go back in time" if you so please.

Here is a solution to Euler 28 using Data.Map :

{-# LANGUAGE BangPatterns #-}

import Data.Map hiding (map)
import Data.List (intercalate, foldl')

data Spiral = Spiral Int (Map (Int,Int) Int)

build :: Int -> [(Int,Int)] -> Map (Int,Int) Int
build size = snd . foldl' move ((start,start,1), empty) where
    start = (size-1) `div` 2
    move ((!x,!y,!n), !m) (dx,dy) = ((x+dx,y+dy,n+1), insert (x,y) n m)

spiral :: Int -> Spiral
spiral size
    | size < 1  = error "spiral: size < 1"
    | otherwise = Spiral size (build size moves) where
        right   = (1,0)
        down    = (0,1)
        left    = (-1,0)
        up      = (0,-1)
        over n  = replicate n up ++ replicate (n+1) right
        under n = replicate n down ++ replicate (n+1) left
        moves   = concat $ take size $ zipWith ($) (cycle [over, under]) [0..]

spiralSize :: Spiral -> Int
spiralSize (Spiral s m) = s

printSpiral :: Spiral -> IO ()
printSpiral (Spiral s m) = do
    let items = [[m ! (i,j) | j <- [0..s-1]] | i <- [0..s-1]]
    mapM_ (putStrLn . intercalate "\t" . map show) items

sumDiagonals :: Spiral -> Int
sumDiagonals (Spiral s m) =
    let total = sum [m ! (i,i) + m ! (s-i-1, i) | i <- [0..s-1]]
     in total-1 -- subtract 1 to undo counting the middle twice

main = print $ sumDiagonals $ spiral 1001

The bang patterns prevent a stack overflow caused by the accumulator items (cursor, number, and Map) not being used until the very end. For most code golfs, input cases should not be big enough to need this provision.

Joey Adams

Posted 2011-03-24T20:19:51.473

Reputation: 9 929

9

The glib answer is: Don't use arrays. The not-so-glib answer is: Try to rethink your problem so that it doesn't need arrays.

Often, a problem with some thought can be done without any array like structure at all. For example, here's my answer to Euler 28:

-- | What is the sum of both diagonals in a 1001 by 1001 spiral?
euler28 = spiralDiagonalSum 1001

spiralDiagonalSum n
    | n < 0 || even n = error "spiralDiagonalSum needs a positive, odd number"
    | otherwise = sum $ scanl (+) 1 $ concatMap (replicate 4) [2,4..n]

What gets expressed in the code here is the pattern of the sequence of numbers as they grow around the rectangular spiral. There was no need to actually represent the matrix of numbers itself.

The key to thinking beyond arrays is to think about what the problem at hand actually means, not how you might represent it as bytes in RAM. This just comes with practice (perhaps a reason I code-golf so much!)

Another example is how I solved the finding maximal paths code-golf. There, the method of passing the partial solutions as a wave through the matrix, row by row, is expressed directly by the fold operation. Remember: On most CPUs, you can't deal with the array as a whole at once: The program is going to have to operate through it over time. It may not need the whole array at once at any time.

Of course, some problems are stated in such a way that they are inherently array based. Languages like ><>, Befunge, or Brainfuck have arrays at their heart. However, even there, arrays can often be dispensed with. For example, see my solution to interpreting Brainfuck, the real core of its semantics is a zipper. To start thinking this way, focus on the patterns of access, and the structure closer to the meaning of problem. Often this needn't be forced into a mutable array.

When all else fails, and you do need to use an array - @Joey's tips are a good start.

MtnViewMark

Posted 2011-03-24T20:19:51.473

Reputation: 4 779