Is it possible to solve TSORT with Haskell?

hase · February 7, 2014, 5:22am

I recently decided to learn Haskell by solving Code Chef problems. Unfortunately, the problem TSORT is proving to be quite a challenge (it should not, I think).

As can be seen from the list of submissions, no one solved this problem with Haskell so far: Link

This particular submission can sort a list of 10^6 integers randomly generated in 0.5seconds on my machine, but still failing the 5 seconds time limit on CodeChef.

Any suggestions to solve this problem with Haskell?

Thank you…

kuruma · February 7, 2014, 5:34am

Hello,

First of all, I must say that you must be a really, really very versatile coder… Understanding I/O in Haskell, which implies understanding Monads (at least, the IO Monad) is actually one of the major difficulties I’ve came across while learning Haskell and it is, imho, one of the biggest challenges a programmer can face on a more theoretical vs. practical level… It is also the reason why I haven’t used Haskell to solve codechef problems… (I’m also learning Haskell and am still a newbie)

With that being said, have you read stuff about Haskell’s FFI? It might be your best shot here… if you can plug it in your code:

Haskell’s FFI and other I/O tricks

More specifically, read point 8 on the link

Best,

Bruno

EDIT: Can’t you process input data in a way that you read it to a list and then output the whole sorted list at once?

If you import Data.List and manage to process input into a list, say L, then it’s simply a matter of doing:

sort L

and then to output something like:

putStr (show (sort L)) or something alike…

hase · February 7, 2014, 6:08am

Thank you for the link. Lot’s of things to learn there. I didn’t worry much about IO because I managed to pass the INTEST problem with Haskell.

kuruma · February 7, 2014, 6:25am

Maybe we can learn Haskell together? Yes, Haskell wiki is a great place
And I still need to attempt INTEST myself

hase · February 7, 2014, 11:00pm

@kuruma Haskell is very fun to learn, unfortunately I don’t think it is suited for CodeChef Not until I find an elegant solution for this problem, something that don’t force me to use C inside Haskell.

kuruma · March 18, 2014, 5:52am

Hello @all again,

Motivated by the question of PHP being broken on Codechef, I was wondering is it is possible to solve this problem in Haskell?

I’ve tried a lot of things but nothing seems to work…

Best,

Bruno

garakchy · March 18, 2014, 7:49am

look at here bro: Haskell solutions

kunal361 · March 18, 2014, 7:50am

@hase @kuruma…are u sure u need to sort the numbers in TSORT??? cant u try using this algo…http://www.codechef.com/viewsolution/2541169

galactic · March 20, 2014, 10:42am

Hi, I am new to Haskell as well and I am stock in this problem as well here is my solution, which runs in like 1 sec which is obviously way under 5 sec but I get a TLE.

import qualified Data.ByteString.Lazy.Char8 as B
import Data.Maybe
import Data.Array

readInt :: B.ByteString -> Int
readInt = fst . fromJust . B.readInt

countingSort :: (Ix n) => [n] -> n -> n -> [n]
countingSort l lo hi = concatMap (uncurry $ flip replicate) count
  where count = assocs . accumArray (+) 0 (lo, hi) . map (\i -> (i, 1)) $ l

main = do
  _ <- getLine  
  content <- fmap B.lines B.getContents
  let arr = map (\y -> (readInt y)) content
  mapM_ print (countingSort arr 0 1000000)

I will keep trying some other things but I don’t know what is the bottle neck here. So is going to be difficult.

davislor · November 14, 2017, 10:43pm

This is a great example of how to code and what to optimize. There are many programmers here better than me, but I’ve figured out some tips and tricks about optimizing Haskell for the problems in the past month, and my solutions are pretty consistently twice as fast as the other Haskell submissions.

This isn’t how you do that; it’s a simplified example. But I’ll be getting to it.

I won’t need to bring up monads again (although there’s no need to be afraid of them). There’s a really handy function that pretty much solves the problem of console I/O for you. It’s called interact, and it’s a higher-order function. That is, its one argument is a function, which takes a string as input and produces another string as output. What interact does is read in pass all the input to this function as soon as it’s read, and print whatever it outputs. Since Haskell evaluates lazily, it can work on only part of the input, and doesn’t have to wait for all of it. It won’t matter for this problem, but a function written with interact is completely able to run interactively, reading in each input line one at a time and printing each result as soon as it’s ready.

Here’s a really idealized walkthrough of how you might solve the problem naïvely. Haskell has the really useful feature that, if you write _, it’s called a hole, and the compiler will warn you that it’s there and tell you its type. So the process of filling in holes is called typed-hole programming.

Let’s start with a basic skeleton. We want to read in a list of numbers as [Int], sort it, and output the sorted list. With the interact function I mentioned, the basic skeleton of our program is:

main :: IO()
main = interact (output . compute . input) where
  input = _
  compute = _
  output = _

If we try to compile this in ghc, we get three error messages, one per hole, about how the compiler Found hole _ :: t Where t is a rigid type variable bound by the inferred type of … Which makes sense when you think about it: we haven’t given the compiler any clue what the types of these three helper functions should be. Whenever the compiler can’t deduce a type, we need to declare it explicitly:

main :: IO()
main = interact (output . compute . input) where
  input :: String -> [Int]
  input = _

  compute :: [Int] -> [Int]
  compute = _

  output :: [Int] -> String
  output = _

The first thing we need to do is split the input into lines and ignore the first line. (One thing you’ll often find yourself doing for contests that’s a bad habit in the real world is skipping input validation.) So we want to look for a function of type String -> [String], which is Prelude.lines. The last thing we need to do is flatten our output items into a single output String with newlines after each line, which is Prelude.unlines. (By the way, if you’re ever wondering what the right function is called, you can search all the Haskell libraries for functions with a given type signature in Hoogle.) That partly fills in two of the holes, but we need to put new ones in between them and the rest of the program:

main :: IO()
main = interact (output . compute . input) where
  input :: String -> [Int]
  input = _ . tail . lines

  compute :: [Int] -> [Int]
  compute = _

  output :: [Int] -> String
  output = unlines . _

Now we get some useful error messages:

• Found hole: _ :: [String] -> [Int]
• In the first argument of ‘(.)’, namely ‘_’
  In the expression: _ . tail . lines
  In an equation for ‘input’: input = _ . tail . lines

• Found hole: _ :: [Int] -> [String]
• In the second argument of ‘(.)’, namely ‘_’
  In the expression: unlines . _
  In an equation for ‘output’: output = unlines . _

GHC tells us what the type signature of the function we need is. It even gives us a list of the relevant local variables. That doesn’t help us at the moment, but it is usually the list of building blocks we have to fill a typed hole with.

So we need a function of type [String] -> [Int] between lines (which returns [String]) and compute (which takes [Int]), and another of type [Int] -> [String] between compute and unlines. The right one doesn’t happen to exist in the standard library, but we can map functions onto lists:

main :: IO()
main = interact (output . compute . input) where
  input :: String -> [Int]
  input = map _ . tail . lines

  compute :: [Int] -> [Int]
  compute = _

  output :: [Int] -> String
  output = unlines . map _

Now GHC tells us that the first hole has type String -> Int and the last has type Int -> String. The generic conversion functions to and from String are defined as typeclasses, and are named read and show respectively:

main :: IO()
main = interact (output . compute . input) where
  input :: String -> [Int]
  input = map read . tail . lines

  compute :: [Int] -> [Int]
  compute = _

  output :: [Int] -> String
  output = unlines . map show

Now we just need a function to sort a list of type [Int]. Haskell has a sorting function for generic ordered types, Data.List.sort, so let’s import it:

import Data.List (sort)

main :: IO()
main = interact (output . compute . input) where
  input :: String -> [Int]
  input = map read . tail . lines

  compute :: [Int] -> [Int]
  compute = sort

  output :: [Int] -> String
  output = unlines . map show

You’ll want to have at least two testcases for every problem saved as text files, one that’s short and simple, and one that’s the maximum size, contains every boundary case, and as hard as possible.

However, this will be much too slow to be accepted. Better solutions to come.

davislor · November 15, 2017, 1:11am

Since these replies are listed as answers, not in the order they were posted, this is a follow-up to my previous answer.

Haskell is a fast language. I’ve gotten performance not far behind C or C++ when running the same algorithms on the same data structures. In every case I’ve found where an optimized Haskell solution wasn’t fast enough, the problem was with the algorithm or data structure I was using, and if you’re more than slightly behind, there’s a better approach that will get you the answer with a lower order of time complexity. In the most pathological case, a mutable STUArray implementation of CSUBQ was good enough to get ten points, the same as its translation into C++, and that’s the single data structure Haskell is worst at. (In the real world, use Data.Vector for mutable Haskell arrays. Unfortunately, it’s not available on CodeChef at present, but it’s supposed to be added to the default distribution soon.)

The lesson there, for me, was not to prematurly optimize so much. I spent a lot of time getting the mutable arrays in the ST monad right, and I learned something from it, but the real solution was to use a different data structure, segment trees.

You will, however, have to start with one optimization: fast I/O. The first snag you’re going to run into, on every single problem, is that the standard Haskell I/O library is extremely slow. Why is that? Remember, I said Haskell is fast when using the same data structures as other languages. The I/O library in Prelude reads and writes String data, which are lists of Char. If you implemented your strings as singly-linked lists of Unicode characters, they’d be slow in C++ too!

The efficient data structure for output is arrays of characters, and ASCII is enough for (nearly?) any problem on this site. An even better one for this purpose is a list of buffers, dynamically allocated as needed. That exists in Haskell, as Data.ByteString.Lazy. Specifically, since we only need to support ASCII, Data.ByteString.Lazy.Char8. (In the real world, always use UTF-8 whenever possible; the library supports that too.) And the fastest way to write your output is to generate, at runtime, a data structure of chained operations to till these buffers. These are in Data.ByteString.Builder. It even gives you back just a little bit of control over the size and behavior of your buffers.

An equivalent skeleton with fast I/O might be:

import Data.ByteString.Builder (Builder, char7, intDec, toLazyByteString)
import qualified Data.ByteString.Lazy.Char8 as B8
import Data.Monoid ((<>))

main :: IO()
main = B8.interact ( output . compute . input ) where
  input :: B8.ByteString -> [Int]
  input = map perLine . tail . B8.lines where
    perLine = decode . B8.readInt

    decode (Just (x, _)) = x
    decode Nothing = error "Invalid input: expected integer."

  compute :: [Int] -> [Int]
  compute = _

  output :: [Int] -> B8.ByteString
  output = toLazyByteString . foldMap perCase where
    perCase :: Int -> Builder
    perCase x = intDec x <> char7 '\n'

And it says a lot about Haskell that the operations to combine the optimized, low-level data structures from I/O are found in the typeclasses Semigroup and Monoid, not to be confused with Monad. It’s actually really simple: a semigroup has a single binary operation <>, where (a <> b) <> c always equals a <> (b <> c), and a monoid also has an empty element such that mempty <> a and a <> mempty both equal a. Making it that generic is what allows us to use Prelude functions like foldMap and syntactic sugar like <> with Builder transparently. Still, it is a bunch of abstract algebra to express the concept of append.

It turns out that Data.List.sort with a variant of this wrapper was fast enough to be accapted. But that’s not how you’re supposed to solve the problem. Take a close look at the conditions: all the numbers are integers between 1 and 1,000,000. There’s a O(max (m, n)) algorithm to sort this array, where n is the size of the input and m the size of the domain. Since the maximum and minimum values of t were set so low, it’s great for this problem. It’s called counting sort. Make an array a covering the entire domain, initialized to zeroes. Each time you encounter a number i, increment a[i] (in Haskell, a!i). Finally, for every i in the domain, output i, a[i] times (which could be zero).

Both updating an array and iterating through an array in Haskell are concepts from imperative languages, and we’ll use library functions to get out of having to do them. If you check the documentation for Data.Array.IArray, you’ll find that there already is a function in the standard library to build an array from a default value and an accumulating function, accumArray, and a function to read out an array as a list of (index, value) pairs, assocs. There are also a few different kinds of array, but in this case, we want UArray from Data.Array.Unboxed. (Boxed arrays, where the elements are evaluated lazily, are great for dynamic programming tables, though!) That lets us partly fill in the hole:

import Data.Array.IArray (accumArray, assocs)
import Data.Array.Unboxed (UArray)
import Data.ByteString.Builder (Builder, char7, intDec, toLazyByteString)
import qualified Data.ByteString.Lazy.Char8 as B8
import Data.Monoid ((<>))

main :: IO()
main = B8.interact ( output . compute . input ) where
  input :: B8.ByteString -> [Int]
  input = map perLine . tail . B8.lines where
    perLine = decode . B8.readInt

    decode (Just (x, _)) = x
    decode Nothing = error "Invalid input: expected integer."

  compute :: [Int] -> [Int]
  compute = _ . assocs . countingSort . _ where
    countingSort :: [(Int, Int)] -> UArray Int Int
    countingSort = accumArray (+) 0 (lower, upper)

    lower = 0
    upper = 1000000

  output :: [Int] -> B8.ByteString
  output = toLazyByteString . foldMap perCase where
    perCase :: Int -> Builder
    perCase x = intDec x <> char7 '\n'

The first hole is now a function from [Int] -> [(Int, Int)], where the first member of the pair is an index and the second is a count of 1. We’ll turn this into a function on individual numbers with map. The second hole is now a function from [(Int, Int)] -> [Int], where again, the first member of the pair is an index and the second is a count of 1. This time, each entry in the list could produce any number of output items, depending on the count, so we’ll combine them with concatMap. This gives us:

import Data.Array.IArray (accumArray, assocs)
import Data.Array.Unboxed (UArray)
import Data.ByteString.Builder (Builder, char7, intDec, toLazyByteString)
import qualified Data.ByteString.Lazy.Char8 as B8
import Data.Monoid ((<>))

main :: IO()
main = B8.interact ( output . compute . input ) where
  input :: B8.ByteString -> [Int]
  input = map perLine . tail . B8.lines where
    perLine = decode . B8.readInt

    decode (Just (x, _)) = x
    decode Nothing = error "Invalid input: expected integer."

  compute :: [Int] -> [Int]
  compute = concatMap _ . assocs . countingSort . map _ where
    countingSort :: [(Int, Int)] -> UArray Int Int
    countingSort = accumArray (+) 0 (lower, upper)

    lower = 0
    upper = 1000000

  output :: [Int] -> B8.ByteString
  output = toLazyByteString . foldMap perCase where
    perCase :: Int -> Builder
    perCase x = intDec x <> char7 '\n'

For the first remaining hole, each time we encounter an i, we want to increase its count by 1, so we want to add (i,1) to the output list. For the last remaining hole, if we get back an association (i,c), we want to produce a list of c copies of i, which is replicate. So, we end up with:

import Data.Array.IArray (accumArray, assocs)
import Data.Array.Unboxed (UArray)
import Data.ByteString.Builder (Builder, char7, intDec, toLazyByteString)
import qualified Data.ByteString.Lazy.Char8 as B8
import Data.Monoid ((<>))

main :: IO()
main = B8.interact ( output . compute . input ) where
  input :: B8.ByteString -> [Int]
  input = map perLine . tail . B8.lines where
    perLine = decode . B8.readInt

    decode (Just (x, _)) = x
    decode Nothing = error "Invalid input: expected integer."

  compute :: [Int] -> [Int]
  compute = concatMap expand . assocs . countingSort . map encode where
    encode i = (i, 1)
    
    countingSort :: [(Int, Int)] -> UArray Int Int
    countingSort = accumArray (+) 0 (lower, upper)

    lower = 0
    upper = 1000000

    expand (i,c) = replicate c i

  output :: [Int] -> B8.ByteString
  output = toLazyByteString . foldMap perCase where
    perCase :: Int -> Builder
    perCase x = intDec x <> char7 '\n'

I submitted a few different variants of it, for benchmarking. Subtle differences can have a big impact on performance, and since Haskell is such a high-level language, it’s often opaque which ones. (It took me a while to figure out that foldMap replicate was twice as fast as Data.Semigroup.stimes, and that’s probably a quirk in the implementation.) The main difference with this version is that it replicates the output of perCase rather than the inputs. Or you might prefer this more concise style.

You’ll notice that it’s possible to pass flags to GHC inside the source code. I haven’t found that -O3 makes a big difference, but it can’t hurt. I would use -llvm as well, since the LLVM code generator is better, but CodeChef doesn’t support it.