Konta, my woob->beancount accounting processing pipeline

This article can be seen as a follow-up to my first post about Plaintext Accounting, which explored how to fetch, process and save my bank-related data in a plaintext file so that I have an easy time accessing and manipulating it.

There are 2 main topics that changed since then, which can be summarized as “Accounting is really boring, so it’s really nice to automate it.”

That was also a nice occasion for me to play with the Python packaging ecosystem, which really only keeps on growing more and more complicated; until our Julia overlords take over, packaging Python native modules is in my opinion the best way to have “business-oriented” people write hacky code to test features leaning on a more robust and statically typed backend.

Now at least I can use a library to fetch my financial data directly and turn it into beancount directives in one go, which automates almost all of my regular bookkeeping.

Building entries for mortgages

I bought a home recently (yay), so now I have to incorporate this huge transaction and all the mortgage calendar information into my beancount files (nay).

Different people have different recipes to deal with this, but my main priorities when tracking this transaction were:

  • I want to keep the value of the house at the time of transaction
  • I want to keep track of how much interest I pay to the bank for the loan; when considering whether we want to repay part of the capital early or not, it’s going to be important (spoiler: given our rate and the inflation right now, we don’t expect to do it any time soon, but still.)
  • I want to keep track of how much I paid in extra fees on top of the house; it’s basically the cost of the transaction, which is going to be an important consideration if we want to move and resell.

This is the resulting entry:

2018-02-19 open Assets:Island
2018-02-19 open Expenses:Bank:Fees
2018-02-19 open Expenses:Fees:Mortgage:Island
2018-02-19 open Liabilities:Mortgage:Island

2018-02-19 * "Previous owners" "Getting an island"
  ; Value at time of purchase
  Assets:Island                                   200000.00 EUR
  ; Various fees around the transaction
  ; I don't care about the details honestly
  Expenses:Bank:Fees                                3000.00 EUR
  ; Total of loan value
  Liabilities:Mortgage:Island                    -150000.00 EUR
  ; All the rest came from a savings account basically;
  ; the amount is left blank so beancount infers it (-53000.00 EUR)
  Assets:Bank:FooSavings

Once the transaction is done, we have a mortgage (with a fixed rate for the whole duration of X years), and the bank gives us a repayment calendar showing how much principal is left and how much interest is paid each month. It’s kind of a chore to insert everything in beancount format by hand, so I made a tool that computes this calendar from the loan characteristics and outputs it in beancount format. It is called Bean Mortgage, and it lets me keep all the transactions in a draft file and only copy them over when I see them happen on the actual account.

bean-mortgage 150000.00 120 0.0180 \
               -c EUR \
               -l Liabilities:Mortgage:Island \
               -e Expenses:Fees:Mortgage:Island \
               -a Assets:Bank:Checking \
               -p "Banque de France" \
               -d "Loan Island" \
               -f 2018-03-19

will produce this kind of output:

2018-03-19 ! "Banque de France" "Loan Island - payment 1"
  Assets:Bank:Checking              -1366.81 EUR
  Expenses:Fees:Mortgage:Island     225.00 EUR
  Liabilities:Mortgage:Island       1141.81 EUR


2018-03-19 balance Liabilities:Mortgage:Island -148858.19 EUR

2018-04-19 ! "Banque de France" "Loan Island - payment 2"
  Assets:Bank:Checking              -1366.81 EUR
  Expenses:Fees:Mortgage:Island     223.29 EUR
  Liabilities:Mortgage:Island       1143.52 EUR


2018-04-19 balance Liabilities:Mortgage:Island -147714.67 EUR

; A lot more...

2028-02-19 ! "Banque de France" "Loan Island - payment 120"
  Assets:Bank:Checking              -1366.55 EUR
  Expenses:Fees:Mortgage:Island     2.05 EUR
  Liabilities:Mortgage:Island       1364.50 EUR


2028-02-19 balance Liabilities:Mortgage:Island 0 EUR

Swapping in the correct amounts, dates, and account names produced my calendar, which I saved in an auxiliary file; I just pull entries from it whenever I see the matching transaction. It could be better of course:

  • Beancount has a proper plugin system that might let me automatically detect the transaction when I insert it and modify it in place, instead of me having to edit it manually.
  • I also noticed while writing this post that the balance directive should be dated the day after the payment (a balance directive is checked at the beginning of its day, before that day’s transactions).

But I don’t deem the returns to be worth the time investment for now, maybe later.
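
For the curious, the arithmetic behind such a calendar fits in a short script. This is a sketch of the standard fixed-rate annuity formula, not the actual Bean Mortgage source: the function names (add_months, schedule, render) are made up for the example, I assume the interest is rounded to the cent each month and that the last payment absorbs the rounding drift, and the balance assertion is dated the day after each payment, as noted in the list above.

```python
import calendar
from datetime import date, timedelta
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def add_months(d, months):
    """Shift a date by whole months, clamping the day to the month's length."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def schedule(principal, months, annual_rate, first_payment):
    """Yield (date, interest, repaid principal, remaining balance) per payment."""
    rate = annual_rate / 12
    # Standard annuity formula for a constant monthly payment.
    payment = (principal * rate / (1 - (1 + rate) ** -months)).quantize(
        CENT, ROUND_HALF_UP
    )
    balance = principal
    for i in range(months):
        interest = (balance * rate).quantize(CENT, ROUND_HALF_UP)
        # The last payment repays whatever is left, absorbing rounding drift.
        repaid = balance if i == months - 1 else payment - interest
        balance -= repaid
        yield add_months(first_payment, i), interest, repaid, balance


def render(row, number):
    """Format one payment as a beancount transaction plus a balance assertion."""
    when, interest, repaid, balance = row
    return "\n".join([
        f'{when} ! "Banque de France" "Loan Island - payment {number}"',
        f"  Assets:Bank:Checking              {-(interest + repaid)} EUR",
        f"  Expenses:Fees:Mortgage:Island     {interest} EUR",
        f"  Liabilities:Mortgage:Island       {repaid} EUR",
        "",
        # Dated one day later: balance assertions apply at the start of the day.
        f"{when + timedelta(days=1)} balance Liabilities:Mortgage:Island {-balance} EUR",
    ])


rows = list(schedule(Decimal("150000.00"), 120, Decimal("0.0180"), date(2018, 3, 19)))
for number, row in enumerate(rows[:2], start=1):
    print(render(row, number))
    print()
```

Using Decimal instead of floats keeps every amount exact to the cent, which matters because beancount checks the balance assertions strictly.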

Processing Woob data directly into Beancount

I’m lucky enough to have all my accounts covered by Woob (don’t @ me about the former name and older controversies). It’s a fantastic scraping toolbox built around modules, and it’s covered in the first article linked at the beginning.

I had 2 main issues with my current processing pipeline:

  • I had to manually call woob bank history ID and woob bank coming ID for each account with a long set of flags to produce the data I wanted.
  • My “quick and dirty” Python script was just that: quick and dirty. Dealing with changes was not pleasant, and having everything in the same place was not portable enough for me.

Scheme version

My first go at solving the portability issue was to make a Guile Scheme binary called konta to properly fetch the config and process the transactions. Using a Scheme means that it’s trivial to configure the pipeline: just make an scm file and set dynamic variables there. It is also very easy to write some custom code that matches specific woob transactions to create specific beancount transactions:

;;; Konta configuration.
;; Evaluating this file will fail if (konta config) hasn't been loaded
(begin
  (use-modules (konta config))
  (use-modules (konta entry))
  (use-modules (konta transform))
  (use-modules (srfi srfi-9))
  (use-modules (srfi srfi-9 gnu))
  ;; alist linking a payee to an account
  (%payee->category '(("75 MONOPRIX" . "Expenses:Food:Supermarché")
                      ("75 MONOP" . "Expenses:Food:Supermarché")
                      ("75 RATP" . "Expenses:Transports:Transports-en-commun")
                      ("75 SNCF INTERNET" . "Expenses:Transports:Train")
                      ("Relevé différé Carte CARDNUMBER" . "Liabilities:CreditCard")
                      ;; ...
                      (default . "Expenses:Uncategorized")))
  ;; alist linking a payee to a name, useful to coalesce various aliases/translate things
  (%payee->name '(("75 MONOPRIX" . "MONOPRIX")
                  ("75 MONOP" . "MONOPRIX")
                  ;; Rename myself whenever I appear
                  ("AGBOBADA Gerry" . "M Gerry Agbobada")
                  ("Relevé différé Carte CARDNUMBER" . "M Gerry Agbobada")))

  ;; Configuration of the main processing function
  (%woob->beancount
   (lambda (woob)
     (let ((p2c (%payee->category))
           (p2n (%payee->name))
           (default-entry (default-woob->beancount woob)))
       (cond
        ;; Automatic transfer to savings detection
        ((string=? (woob-entry-label woob) "SAVINGS")
         (set-beancount-entry-expense-type default-entry "Assets:Savings:Foo")
         (when (string=? (woob-entry-raw-value woob) "SAVINGS AUTOMATIC TRANSFER")
           (set-beancount-entry-details default-entry "Periodic transfer to savings"))
         default-entry)

        ;; Use other cond clauses to make specific transactions

        ;; If nothing matched, use the default function
        (#t default-entry))))))

The parameters allow me to write data validation in Guile on the module side, and have all the configuration handled in a single Scheme file. And Scheme is a very cool language to work with (the main reason I chose to write an R7RS interpreter in Rust for my pet project). The code lives in an unlisted repository for now, with a lot of French documentation.

But the Scheme version did not solve the first issue with my pipeline: I still needed to shell out at some point to fetch the transactions using woob and write them to multiple CSV files. This arguably could have been solved in Guile directly, but I have this Python itch since I work with it a bit, and both beancount and woob are Python libraries, so I decided to rewrite it in Python to eventually be able to stream the data in one go.

Python version

After a few imports with my Guile tool, I decided that having to call woob manually for every account was not good enough: if my accounting ingestion process is not easy, I’ll slack. So I rewrote it in Python.

I also used Flit to package it and make a proper PyPI package (even if it has way less documentation for now). I’ve been using it since, and I see myself using and expanding this pipeline for a while. The main features are:

  • A single command to process everything: PyKonta automatically reads the same woob configuration as the user has in the CLI to pull the data, process it, and output to stdout a list of beancount directives merged and sorted by date. This is a huge improvement because that is way fewer steps to deal with.
  • A Python library and binary: much to my regret, a Python blob is more portable than a Guile Scheme blob when it comes to installing from common distributions/OSes.
  • A TOML configuration: having a TOML configuration instead of a Scheme file means that it’s eventually going to be easier to make some sort of GUI on top of the library to change the configuration (which really is a set of matching rules) in an intuitive interface.
     [Default]
     # Setting the category for unmatched payees.
     category = "Expenses:Uncategorized"

     [Accounts]
     # Woob ID (from woob bank ls) to Beancount account
     "deadbeef0731233" = { name = "Assets:Bank:Checking", default-currency = "EUR" }

     [Payees]
     "75 MONOP" = { name = "MONOPRIX", category = "Expenses:Food:Supermarché" }
     "75 MONOPRIX" = { name = "MONOPRIX", category = "Expenses:Food:Supermarché" }

     # Unused for now, but extra verifications can be done if we
     # give the main input file, with valid/existing accounts
     [Beans]
     input = []

I use an account that doesn’t exist for the default category, this way bean-check forces me to deal with unmatched operations.

Thanks to this, my current pipeline is pretty straightforward:

  1. Call python -m konta and pipe the output to an import.beancount file.
  2. Check the import file to see if new rules should be added, and reimport the file.
  3. Fix the few uncategorized transactions (usually the Paypal/Amazon related transfers).
  4. Paste the import in my main file, add balance assertions, and fix the inconsistencies.

Usually this takes me 10 to 15 minutes if I do it weekly, but step 4 can really be a pain if I skip it for 3+ months (this never happened to me of course /s). Each minute saved there really pays off, as it reduces the likelihood of me “forgetting” to do it.

Caveats

The configuration currently lacks the flexibility of defining a custom matcher with code, but I’m planning to fix this eventually (probably by manipulating PYTHONPATH and naming custom matchers as Python modules/functions). The binary also lacks a lot of polish: it should have better CLI flags and example configurations. I probably also want tests at some point, but mocking woob responses might be tough. Konta also adds an extra `id` key to transactions, but I will probably change the way it’s handled, as I can’t use the current system to merge 2 transactions that represent a transfer between 2 accounts I own.
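
To make the custom matcher idea a bit more concrete, here is one shape it could take. Nothing like this exists in Konta yet: load_matcher and the "module:function" naming scheme are hypothetical, and the only real machinery used is the standard importlib module, which resolves names against whatever PYTHONPATH exposes.

```python
import importlib


def load_matcher(spec):
    """Resolve a "package.module:function" string to a callable (hypothetical)."""
    module_name, _, attribute = spec.partition(":")
    if not module_name or not attribute:
        raise ValueError(f"expected 'module:function', got {spec!r}")
    # import_module searches sys.path, which PYTHONPATH extends.
    matcher = getattr(importlib.import_module(module_name), attribute)
    if not callable(matcher):
        raise TypeError(f"{spec!r} does not point to a callable")
    return matcher


# Demonstration with a standard library function standing in for a user matcher.
basename = load_matcher("os.path:basename")
print(basename("/tmp/import.beancount"))
```

A configuration entry could then name a matcher as a plain string, which keeps the TOML file declarative while still leaving an escape hatch for code.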

Closing words

There are a lot of things that could be better with this Python proof of concept, but if I wait for things to be perfect before talking about it then I’ll never talk. Once I fix a few of the obvious usability issues I see with this, I’ll make a proper README documentation for PyPI and start trying to convince other people to try it.

I want to help people (myself included) get a better view of their finances without having this data/intelligence locked behind our banks’ features. Having a good plaintext accounting ingestion process is, in my opinion, the main barrier to entry into the plaintext accounting world, so this post is my contribution to show an example of how to deal with this issue.

This particular post might be mostly useful to French people though, as woob has a pretty strong focus on French banking institutions; Budget Insight contributes to the maintenance of woob bank modules as an important part of their product, if I understand correctly.

I hope it’s going to be useful to some people :)

Take care, Gerry
