[input-output-hk/cardano-sl] [CO-347] Wallet’s UTXO histogram view (#3402)

KtorZ requested changes on this pull request.

> + | ErrHistogramUpperBoundsNegative
+ | ErrAllStakesNegative
+ deriving (Show)
+
+validateUtxoStatistics :: HashMap Text Word64 -> Word64 -> Either UtxoStatisticsError (HashMap Text Word64, Word64)
+validateUtxoStatistics histogram allStakes
+ | histogramBinNumCond histogram = Left ErrHistogramEmpty
+ | histogramKeysCond histogram = Left ErrHistogramNamesInvalid
+ | histogramValsCond histogram = Left ErrHistogramUpperBoundsNegative
+ | allStakesCond allStakes = Left ErrAllStakesNegative
+ | otherwise = Right (histogram, allStakes)
+ where
+ histogramBinNumCond histo = (length $ HMS.keys histo) <= 0 + validateKeys = any (\key -> notElem key $ map show (NL.toList $ generateBounds Log10) )
+ histogramKeysCond = validateKeys . HMS.keys
+ validateVals = any (< 0) `<= 0` ? Are we fine with zero values? Or are we interested to catch any non-positive value ? > + deriving (Show)
+
+validateUtxoStatistics :: HashMap Text Word64 -> Word64 -> Either UtxoStatisticsError (HashMap Text Word64, Word64)
+validateUtxoStatistics histogram allStakes
+ | histogramBinNumCond histogram = Left ErrHistogramEmpty
+ | histogramKeysCond histogram = Left ErrHistogramNamesInvalid
+ | histogramValsCond histogram = Left ErrHistogramUpperBoundsNegative
+ | allStakesCond allStakes = Left ErrAllStakesNegative
+ | otherwise = Right (histogram, allStakes)
+ where
+ histogramBinNumCond histo = (length $ HMS.keys histo) <= 0 + validateKeys = any (\key -> notElem key $ map show (NL.toList $ generateBounds Log10) )
+ histogramKeysCond = validateKeys . HMS.keys
+ validateVals = any (< 0) + histogramValsCond = validateVals . HMS.elems + allStakesCond = (< 0) Interesting way of doing it. For those, I usually leverage the `Monad` instance of `Either` and the `when` combinator. This way, we avoid the hassle and overhead of defining extra identifiers. I believe it reads a bit better: ```hs validateUtxoStatistics :: HashMap Text Word64 -> Word64
-> Either UtxoStatisticsError (HashMap Text Word64, Word64)
validateUtxoStatistics histogram allStakes = do
let (keys, elems) = (HMS.keys histogram, HMS.elems histogram)

let acceptedKeys = show <$> (toList $ generateBounds Log10)

when (length keys <= 0) $ Left ErrHistogramEmpty when (any (flip notElem acceptedKeys) keys) $ Left ErrHistogramNamesInvalid when (any (< 0) elems) $ Left ErrHistogramUpperBoundNegative when (allStakes < 0) $ Left ErrAllStakesNegative Right (histogram, allStakes) ``` > +
+— Buckets boundaries can be constructed in different way
+data BoundType = Log10 | Haphazard
+
+generateBounds :: BoundType -> NonEmpty Word64
+generateBounds bType =
+ let (^!) :: Word64 -> Word64 -> Word64
+ (^!) = (^)
+ in case bType of
+ Log10 -> NL.fromList $ ( map (\toPower -> 10 ^! toPower) [1..16] ) ++ [45 * (10 ^! 15)]
+ Haphazard -> NL.fromList [10, 100, 1000, 10000]
+
+instance Arbitrary HistogramBar where
+ arbitrary = do
+ possiblenames <- elements $ map show (NL.toList $ generateBounds Log10) + bound <- arbitrary This one isn't entirely arbitrary, it is `suchThat (\n -> n <= upperBound)` > +import Pos.Infra.Util.LogSafe (BuildableSafeGen (..),
+ deriveSafeBuildable)
+
+
+— Utxo statistics for the wallet.
+— Histogram is composed of bars that represent the bucket. The bucket is tagged by upper bound of a given bucket.
+— The bar value corresponds to the number of stakes
+— In the future the bar value could be different things:
+— (a) sum of stakes in a bucket
+— (b) avg or std of stake in a bucket
+— (c) topN buckets
+— to name a few
+data HistogramBar = HistogramBarCount
+ { bucketName :: !Text
+ , bucketUpperBound :: !Word64
+ } deriving (Show, Eq, Ord, Generic)

There’s a confusion in the field names here :upside_down_face: , took me a short while to figure it out from the rest of the code.
`bucketUpperBound` actually refers to the value / number of stakes whereas `bucketName` is the actual upper bound.

Hence the following question: what is the need for the bound (i.e. `bucketName` here) to be a `Text`? Why isn’t this a `Word64`, especially when it is generated using `Word64` values and converted to Text using `show`. If it’s about using it as a key in the JSON representation, having a `Word64` here shouldn’t be much of a problem!

> @@ -0,0 +1,219 @@
+{-# LANGUAGE DeriveGeneric #-}
+{-# LANGUAGE LambdaCase #-}
+
+module Cardano.Wallet.Types.UtxoStatistics
+ ( computeUtxoStatistics
+ , UtxoStatistics (..)
+ , HistogramBar (..)

Not sure we want to expose the internals to the outside world. I’d be in favor of opaque types with smart-constructors :+1:

> + %» upperBound=»%build
+ %» }»)
+ bucketName
+ bucketUpperBound
+
+
+data UtxoStatistics = UtxoStatistics
+ { theHistogram :: ![HistogramBar]
+ , theAllStakes :: !Word64
+ } deriving (Show, Generic, Ord)
+
+toMap :: [HistogramBar] -> Map Text Word64
+toMap = Map.fromList . map (\(HistogramBarCount key val) -> (key,val))
+
+instance Eq UtxoStatistics where
+ (UtxoStatistics h s) == (UtxoStatistics h’ s’) = s == s’ && toMap h == toMap h’

Converting to `Map` seems overkill, comparing sorted list should be enough I believe? Plus, if we consider the «performance» argument (it doesn’t really matter here anyway), it’s at least `O(n*log n)` to create and compare both. Maps whereas at most `O(n)` to sort and compare both lists.

> +
+data UtxoStatistics = UtxoStatistics
+ { theHistogram :: ![HistogramBar]
+ , theAllStakes :: !Word64
+ } deriving (Show, Generic, Ord)
+
+toMap :: [HistogramBar] -> Map Text Word64
+toMap = Map.fromList . map (\(HistogramBarCount key val) -> (key,val))
+
+instance Eq UtxoStatistics where
+ (UtxoStatistics h s) == (UtxoStatistics h’ s’) = s == s’ && toMap h == toMap h’
+
+instance ToJSON UtxoStatistics where
+ toJSON (UtxoStatistics bars allStakes) =
+ let histogramObject = Object . HMS.fromList . map extractBarKey
+ extractBarKey (HistogramBarCount bound stake) = bound .= stake

That’s what lead to the conclusions about the actual meaning of `bucketUpperBound` and `bucketName` in the comments above.

> +
+
+deriveSafeBuildable »HistogramBar
+instance BuildableSafeGen HistogramBar where
+ buildSafeGen _ HistogramBarCount{..} =
+ bprint («{»
+ %» name=»%build
+ %» upperBound=»%build
+ %» }»)
+ bucketName
+ bucketUpperBound
+
+
+data UtxoStatistics = UtxoStatistics
+ { theHistogram :: ![HistogramBar]
+ , theAllStakes :: !Word64

Can’t `allStakes` be derived from the list of histogram already? I mean, it’s a nice-to-have in the API response, that’s for sure, but I think here we better have a

«`hs
allStake :: UtxoStatistics -> Word64
«`

GHC-runtime is rather good at caching these kind of results anyway. And we only need it when serializing to JSON right?

> + deriving (Show)
+
+validateUtxoStatistics :: HashMap Text Word64 -> Word64 -> Either UtxoStatisticsError (HashMap Text Word64, Word64)
+validateUtxoStatistics histogram allStakes
+ | histogramBinNumCond histogram = Left ErrHistogramEmpty
+ | histogramKeysCond histogram = Left ErrHistogramNamesInvalid
+ | histogramValsCond histogram = Left ErrHistogramUpperBoundsNegative
+ | allStakesCond allStakes = Left ErrAllStakesNegative
+ | otherwise = Right (histogram, allStakes)
+ where
+ histogramBinNumCond histo = (length $ HMS.keys histo) <= 0 + validateKeys = any (\key -> notElem key $ map show (NL.toList $ generateBounds Log10) )
+ histogramKeysCond = validateKeys . HMS.keys
+ validateVals = any (< 0) + histogramValsCond = validateVals . HMS.elems + allStakesCond = (< 0) Interesting way of writing this ^.^ When dealing with `Either` like this, I usually go for the `Monad` approach and leverage the `when` combinator. It avoids all the hassle of defining extra identifiers (making sure variables don't conflicts etc..) and is, to my opinion, a bit more readable: ```hs validateUtxoStatistics :: HashMap Text Word64 -> Word64
-> Either UtxoStatisticsError (HashMap Text Word64, Word64)
validateUtxoStatistics histogram allStakes = do
let (keys, elems) = (HMS.keys histogram, HMS.elems histogram)

let acceptedKeys = show <$> (toList $ generateBounds Log10)

when (length keys <= 0) $ Left ErrHistogramEmpty when (any (flip notElem acceptedKeys) keys) $ Left ErrHistogramNamesInvalid when (any (< 0) elems) $ Left ErrHistogramUpperBoundNegative when (allStakes < 0) $ Left ErrAllStakesNegative Right (histogram, allStakes) ``` Also, this function looks like a good candidate for a smart constructor and could return an actual `UtxoStatistics` rather than an HashMap and a Word64. You can find some inspiration here maybe: https://github.com/input-output-hk/cardano-sl/blob/Squad1/CO-325/api-v1-improvements/wallet/src/Pos/Util/Mnemonic.hs#L126-L149 https://github.com/input-output-hk/cardano-sl/blob/Squad1/CO-325/api-v1-improvements/wallet/src/Pos/Util/Mnemonic.hs#L350-L363 (NOTE: We might want to expose this `eitherToParse` function somewhere else ...) > + & at «100» ?~ wordRef
+ & at «1000» ?~ wordRef
+ & at «10000» ?~ wordRef
+ & at «100000» ?~ wordRef
+ & at «1000000» ?~ wordRef
+ & at «10000000» ?~ wordRef
+ & at «100000000» ?~ wordRef
+ & at «1000000000» ?~ wordRef
+ & at «10000000000» ?~ wordRef
+ & at «100000000000» ?~ wordRef
+ & at «1000000000000» ?~ wordRef
+ & at «10000000000000» ?~ wordRef
+ & at «100000000000000» ?~ wordRef
+ & at «1000000000000000» ?~ wordRef
+ & at «10000000000000000» ?~ wordRef
+ & at «45000000000000000» ?~ wordRef

It reads better this way indeed :+1:

> + UtxoStatistics
+ <$> populateBuckets bounds
+ <*> L.sum
+
+populateBuckets :: NonEmpty Word64 -> L.Fold Word64 [HistogramBar]
+populateBuckets bounds =
+ L.Fold (addCountInBuckets $ head bounds) (initalizeMap bounds)
+ (fmap (\(x1, x2) -> HistogramBarCount (T.pack $ show x1) x2) . Map.toList)
+ where
+ initalizeMap :: NonEmpty Word64 -> Map.Map Word64 Word64
+ initalizeMap b = Map.fromList $ NL.toList $ NL.zip b (NL.repeat 0)
+ addCountInBuckets :: Word64 -> Map.Map Word64 Word64 -> Word64 -> Map.Map Word64 Word64
+ addCountInBuckets thefirst acc entry =
+ case Map.lookupGE entry acc of
+ Just (k, v) -> Map.insert k (v+1) acc
+ Nothing -> Map.adjust (+1) thefirst acc

From `foldl` documentation, we find this:

«`hs
Fold (x -> a -> x) x (x -> b) — Fold step initial extract
«`

When there’s a clear structure like this, it’s usually a good idea to have the implementation follows this structure as it helps understanding (same for `bracket` for instance).

What about something like:

«`hs
foldBuckets :: NonEmpty Word64 -> L.Fold Word64 [HistogramBar]
foldBuckets bounds =
let
step x a =
case Map.lookupGE a x of
Just (k, v) -> Map.insert k (v+1) x
Nothing -> Map.adjust (+1) (head bounds) x

initial =
Map.fromList $ zip (NL.toList bounds) (repeat 0)

extract =
map (uncurry HistogramBar) . Map.toList
in
L.Fold step initial extract
«`

> + ]
+
+instance Arbitrary TxIn where
+ arbitrary = oneof
+ [ TxInUtxo <$> arbitrary <*> arbitrary
+ , TxInUnknown <$> choose (1, 255) <*> scale (min 150) arbitrary
+ ]
+ shrink = genericShrink
+
+instance Arbitrary TxOutAux where
+ arbitrary = genericArbitrary
+ shrink = genericShrink
+
+instance Arbitrary TxOut where
+ arbitrary = genericArbitrary
+ shrink = genericShrink

What does all this has to do with the PR :open_mouth: ? Maybe an artifact from a rebase / merge conflict ?

Добавить комментарий