速いnubが欲しい.(Haskell)

The Haskell 98 Library Report: List Utilities
によると,

nub              :: (Eq a) => [a] -> [a]
nub []           =  []
nub (x:xs)       =  x : nub (filter (\y -> not (x == y)) xs)

なので,実はnubはO(n^2). Eq だけ,だから仕方ないのかもしれないが.


しかし,

nub :: (Ord a) => [a] -> [a]

にすれば,O(n log n) にできる.Setのmember, insertは log n であることを利用する.

import Data.Set (empty, member, insert)

nub' :: (Ord a) => [a] -> [a]
nub' l = nub'' l empty
    where nub'' [] _         = []
          nub'' (x:xs) s
              | x `member` s = nub'' xs s
              | otherwise    = x : nub'' xs (x `insert` s)


O(n^2) と O(n log n) では大違いです. cf. sort



ところで,ソース
http://haskell.org/ghc/docs/latest/html/libraries/base/src/Data-List.html#nub
を見てみたら,#ifdef USE_REPORT_PRELUDE なんて文がある.どうやらnubの実装には2種類あるみたい.
しかし,自分のghcはどちらを使っているのか?

nub                     :: (Eq a) => [a] -> [a]
#ifdef USE_REPORT_PRELUDE
nub                     =  nubBy (==)
#else
-- stolen from HBC
nub l                   = nub' l []             -- '
  where
    nub' [] _           = []                    -- '
    nub' (x:xs) ls                              -- '
        | x `elem` ls   = nub' xs ls            -- '
        | otherwise     = x : nub' xs (x:ls)    -- '
#endif

-- | The 'nubBy' function behaves just like 'nub', except it uses a
-- user-supplied equality predicate instead of the overloaded '=='
-- function.
nubBy                   :: (a -> a -> Bool) -> [a] -> [a]
#ifdef USE_REPORT_PRELUDE
nubBy eq []             =  []
nubBy eq (x:xs)         =  x : nubBy eq (filter (\ y -> not (eq x y)) xs)
#else
nubBy eq l              = nubBy' l []
  where
    nubBy' [] _         = []
    nubBy' (y:ys) xs
       | elem_by eq y xs = nubBy' ys xs
       | otherwise       = y : nubBy' ys (y:xs)

-- Not exported:
-- Note that we keep the call to `eq` with arguments in the
-- same order as in the reference implementation
-- 'xs' is the list of things we've seen so far, 
-- 'y' is the potential new element
elem_by :: (a -> a -> Bool) -> a -> [a] -> Bool
elem_by _  _ []         =  False
elem_by eq y (x:xs)     =  y `eq` x || elem_by eq y xs
#endif


# log は定数だから, O(n log n) => O(n).やったリニアだ.

起動時に壁紙をランダムに設定する(Openbox).

壁紙の設定には Feh - ArchWiki を使っている.
壁紙達が ~/pic/wallpapers/ 以下に入っているとして,
autostart.sh に以下の用な記述を追加.

feh --bg-scale $( find ~/pic/wallpapers/ -type f -or -type l | ~/bin/random-select )

画像へのシンボリックリンクも含まれているので, -type l オプションを追加した.

random-select は ランダムに1行返すhaskellスクリプト

import System.Random

main = do xs <- return.lines =<< getContents
          putStrLn.((!!) xs) =<< getStdRandom (randomR (0,length xs - 1))
追記

shufというコマンドがあって,そっちを使えば良い.
車輪の再発明をしていたようだ

feh --bg-scale $( find ~/pic/wallpapers/ -type f -or -type l | shuf -n 1)

locate -S

locate -S

-S, --statistics
Write statistics about each read database to standard output instead of searching for files and
exit successfully.

ということで,実行してみた(desktop pc, Archlinux).

Database /var/lib/mlocate/mlocate.db:
        37072 directories
        477783 files
        38478877 bytes in file names
        10420776 bytes used to store database

うん.いまいち,どう解釈して良いのやら.

ちなみに,ほとんど起動しない ubuntu でやってみたら表示形式が違った.

Database /var/cache/locate/locatedb is in the GNU LOCATE02 format.
Locate database size: 673886 bytes
All Filenames: 67083
File names have a cumulative length of 2788565 bytes.
Of those file names,

   193 contain whitespace,
   0 contain newline characters,
   and 10 contain characters with the high bit set.
Compression ratio 75.83% (higher is better)

どうやら,違うlocateだったみたい.
locate --versionの出力.

Archlinux
mlocate 0.22
Copyright (C) 2007 Red Hat, Inc. All rights reserved.
This software is distributed under the GPL v.2.

This program is provided with NO WARRANTY, to the extent permitted by law.
Ubuntu
locate (GNU findutils) 4.4.0
Copyright (C) 2007 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Eric B. Decker, James Youngman, and Kevin Dalley.
Built using GNU gnulib version e5573b1bad88bfabcda181b9e0125fb0c52b7d3b