How to speed up RAID (5-6) growing with mdadm?

Yesterday, I added the 11th disk to my RAID 6 array. Since the last grow took me more than 20 hours, I spent some time investigating how to speed things up, and this post contains some tips on how to achieve good grow performance. With these tips, I reached an average speed of about 55,000K/sec during the reshape, and it finished in about 13 hours.

First, take into account that some of these tips may depend on your configuration. In my case, this server is only used for this RAID array, so I don't care if the CPU is heavily used during the rebuild or if other processes suffer from the load. This may not be the case with your configuration. Moreover, I speak only of hard disks; if you use an SSD RAID, there are probably better ways of tuning the rebuild (or perhaps it is already fast enough). Finally, you have to know that a RAID reshape is going to be slow; there is no way you'll grow a 10+ disk RAID array in one hour.

In the examples, I use /dev/md0 as the RAID array; you'll have to change this to your array's name.

The first 3 tips can be used even after the reshape has started, and you should see the differences in real-time. But these 3 tips will also be reset after each reboot.

Increase speed limits

The easiest thing to do is to increase the system speed limits on RAID. You can see the current limits on your system with these commands:

sysctl dev.raid.speed_limit_min
sysctl dev.raid.speed_limit_max

These values are set in Kibibytes per second (KiB/s).

You can set them to high values:

sysctl -w dev.raid.speed_limit_min=100000
sysctl -w dev.raid.speed_limit_max=500000

At least with these values, you won't be limited by the system.
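
Since these sysctl values are reset at each reboot, you may want to persist them. A minimal sketch, assuming your distribution reads /etc/sysctl.d/ at boot (the file name is made up):

# /etc/sysctl.d/90-raid-rebuild.conf
dev.raid.speed_limit_min = 100000
dev.raid.speed_limit_max = 500000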

Increase stripe cache size

By allowing the array to use more memory for its stripe cache, you may improve performance; in some cases, it can improve performance by up to 6 times. By default, the size of the stripe cache is 256, expressed in pages. By default, Linux uses 4096-byte pages. If you use 256 pages for the stripe cache and you have 10 disks, the cache uses 10*256*4096 = 10 MiB of RAM. In my case, I have increased it to 4096:

echo 4096 > /sys/block/md0/md/stripe_cache_size

The maximum value is 32768. If you have many disks, this may well take all your available memory. I don't think values higher than 4096 will improve performance, but feel free to try it ;)
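
You can check the current value and estimate the memory cost before changing anything; a quick sketch using the formula above:

cat /sys/block/md0/md/stripe_cache_size
# memory used = stripe_cache_size * page size * number of disks
# e.g. 4096 * 4096 B * 11 disks = 176 MiB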

Increase read-ahead

If configured too low, the read-ahead of your array may slow things down.

You can get the current read-ahead value with this command:

blockdev --getra /dev/md0

This value is in 512-byte sectors. You can set it to 32 MiB (65536 * 512 B) to be sure:

blockdev --setra 65536 /dev/md0

This can improve performance, but don't expect it to be a game-changer unless read-ahead was configured really low in the first place.

Bonus: Speed up standard resync with a write-intent bitmap

Although it won't speed up the growing of your array, this is something that you should do after the reshape has finished. A write-intent bitmap is a kind of map of what needs to be resynced. This is of great help in several cases:

  • When the computer crashes (after a power failure, for instance)
  • When a disk is disconnected, then reconnected

In these cases, it may completely avoid the need for a full resync, which is great in my opinion. Moreover, it does not take any space on the array, since it is stored in space that is not usable by the array.

Here is how to enable it:

mdadm --grow --bitmap=internal /dev/md0

However, it may cause some write performance degradation. In my case, I haven't seen any noticeable degradation, but if you do, you may want to disable it:

mdadm --grow --bitmap=none /dev/md0

Bonus: Monitor rebuild process

If you want to monitor the rebuild process, you can use the watch command:

watch cat /proc/mdstat

With that, you'll see the rebuild progress in real-time.

You can also monitor the I/O statistics:

watch iostat -k 1 2

Bonus: How to grow a RAID 5-6 array

As a side note, this section shows how to grow an array. If you want to add the disk /dev/sdl to the array /dev/md0, you first have to add it:

mdadm --add /dev/md0 /dev/sdl

This will add the disk as a spare. If you had 5 disks before, you'll want to grow the array to 6:

mdadm --grow --backup-file=/root/grow_md0_backup_file --raid-devices=6 /dev/md0

The backup file must be on another disk, of course. The backup file is optional, but it improves the chances of success if you have a power failure or another form of unexpected shutdown. If you know what you're doing, you can grow the array without a backup file:

mdadm --grow --raid-devices=6 /dev/md0

This command returns almost instantly, but the actual reshape will likely take hours (maybe days) to finish.

Once the reshape is finished, you'll still have to extend the filesystem with resize2fs. If you use LVM on top of the array, you'll have to resize the Physical Volume (PV) first:

pvresize /dev/md0

and then extend the Logical Volume(s) (LV). For instance, if you want to add 1 TiB to an LV named /dev/vgraid/work:

lvextend -r -L+1T /dev/vgraid/work

The -r option automatically resizes the underlying filesystem. Otherwise, you'd still have to resize it yourself with resize2fs.

Conclusion

These are the changes I have found that speed up the reshape process. There are others that you may test on your system; for instance, on some systems, disabling NCQ on each disk may help, as sketched below.
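
For the NCQ case, here is a sketch of how it can be disabled by setting the queue depth to 1 (the device names are assumptions; adapt them to your disks):

for disk in sda sdb sdc; do
    echo 1 > /sys/block/$disk/device/queue_depth
done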

I hope that these tips will help you do fast rebuilds of your RAID arrays :)

Named Optional Template parameters to configure a class at compile-time

In this post, I'll describe a technique that can be used to configure a class at compile-time when the class has multiple optional parameters with default values. I used this technique in my dll project to configure each Restricted Boltzmann Machine instance.

The technique presented here only works with C++11 because of the need for variadic templates. It could be emulated without them by fixing a maximum number of parameters, but I won't go into that in this post.

The problem

For this post, we'll take the case of a single class, let's call it configurable. This class has several parameters:

  • A of type int
  • B of type char
  • C of an enum type
  • D of type bool
  • E is a type
  • F is a template type

This class could simply be written as such:

enum class type {
    AAA,
    BBB,
    CCC
};

template<int T_A = 1, char T_B = 'b', type T_C = type::BBB, bool T_D = false, typename T_E = watcher_1, template<typename> class T_F = trainer_1>
struct configurable_v1 {
    static constexpr const int A = T_A;
    static constexpr const char B = T_B;
    static constexpr const type C = T_C;
    static constexpr const bool D = T_D;

    using E = T_E;

    template<typename C>
    using F = T_F<C>;

    //Something useful
};

and used simply as well:

using configurable_v1_t = configurable_v1<100, 'z', type::CCC, true, watcher_2, trainer_2>;

This works well and nothing is wrong with this code. However, if you want all default values except the last one, you have to specify every one of the previous template parameters as well. The first disadvantage is that it is verbose and tedious. Secondly, instead of relying on the default values implicitly, you have spelled them out. This means that if the default values are changed by the library authors, or even by you in the configurable_v1 class, all the usages will either be out of sync or have to be updated. Again, this is not practical. Moreover, if the author of the configurable_v1 template adds new template parameters before the last one, you'll have to update all the instantiation points as well.

Moreover, we only have 6 parameters here; if you have more, the problem becomes even worse.

The solution

What can we do to solve these problems? We are going to use variadic template parameters in the configurable class and a simple class for each possible parameter. This will be done in the configurable_v2 class. In the end, you can use the class like this:

using configurable_v2_t1 = configurable_v2<a<100>, b<'z'>, c<type::CCC>, d, e<watcher_2>, f<trainer_2>>;
using configurable_v2_t2 = configurable_v2<f<trainer_2>>;

You can note that, on the second line, we only specified the value for the last parameter, without specifying any other value :) This is also much more flexible since the order of the parameters has absolutely no impact. Here, for the sake of the example, the parameters are badly named, so it is not very clear what this does, but in practice you can give better names to the parameters and make the types clearer. Here is an example from my dll library:

using rbm_t = dll::rbm_desc<
    28 * 28, 200,
    dll::batch_size<25>,
    dll::momentum,
    dll::weight_decay<>,
    dll::visible<dll::unit_type::GAUSSIAN>,
    dll::shuffle,
    dll::weight_type<float>
>::rbm_t;

rbm_desc is a class that is configurable with this technique, except that the first two parameters are mandatory and not named. I personally think that this is quite clear, but of course I may be biased ;)

So let's code!

The class declaration is quite simple:

template<typename... Args>
struct configurable_v2 {
    //Coming
};

We will now have to extract values and types from Args in order to get the 4 values, the type and the template type out of it.

Extracting integral values

We will start with the parameter a, which holds a value of type int with a default value of 1. Here is one way of writing it:

struct a_id;

template<int value>
struct a : std::integral_constant<int, value> {
    using type_id = a_id;
};

So, a is simply an integral constant with an additional typedef, type_id. Why do we need this id? Because a is a class template, we cannot use std::is_same to compare it with other types, since its value is part of its type. If we had only int values, we could easily write a traits class indicating whether a type is a specialization of a, but since we will have several parameter types, that would be a real pain: we would need such a traits class for each possible parameter. Here, the simple way to go is to add an inner identifier to each type.

We can now write a struct to extract the int value for a from Args. Args is a list of types of the form parameter_name<parameter_value>... We have to find a specialization of a inside this list. If such a specialization is present, we'll take its integral constant value as the value for a; otherwise, we'll take the default value. Here is what we want to do:

template<typename... Args>
struct configurable_v2 {
    static constexpr const int A = get_value_int<a<1>, Args...>::value;

    //Coming
};

We specify the default value (1) for a directly in the class, and we use the get_value_int class to get its value from the variadic type list. Here is the implementation:

template<typename D, typename... Args>
struct get_value_int;

template<typename D>
struct get_value_int<D> : std::integral_constant<int, D::value> {};

template<typename D, typename T2, typename... Args>
struct get_value_int<D, T2, Args...> {
    template<typename D2, typename T22, typename Enable = void>
    struct impl
        : std::integral_constant<int, get_value_int<D, Args...>::value> {};

    template<typename D2, typename T22>
    struct impl <D2, T22, std::enable_if_t<std::is_same<typename D2::type_id, typename T22::type_id>::value>>
        : std::integral_constant<int, T22::value> {};

    static constexpr const int value = impl<D, T2>::value;
};

If you are not really familiar with Template Metaprogramming (TMP), this may seem very unfamiliar or even barbaric, but I'll try to explain in detail what is going on here :)

get_value_int is a template that takes a type D, representing the parameter we want to extract together with its default value, and the list of args. It has a first partial specialization for the case when Args is empty, in which case its value is simply the value inside D (the default value). The second partial specialization handles the case when there is at least one type (T2) in the list of args. This separation into two partial specializations is the standard way to work with variadic template parameters. The second specialization is more complicated than the first one since it uses an inner class to get the value out of the list. The inner class (impl) takes the parameter type (D2), the type that is present in the list (T22) and a special parameter (Enable) that is used for SFINAE. If you're not familiar with SFINAE (you're probably not reading this article...), it is, put simply, a means to activate or deactivate a template class or function based on its template parameters. Here, the partial specialization of impl is enabled when T22 and D2 have the same type_id, in which case the value of T22 is taken as the result of impl. In the base case, template recursion is used to continue iterating over the list of types. This has to be done with two template classes because we cannot add a new template parameter, even an unnamed one, to a partial template specialization. Nor can we simply add an Enable parameter to get_value_int itself: it would have to come before Args, and then every use of the template would have to give it a value, which is neither practical nor good practice.

We can now do the same for b, which is of type char. Here is the parameter definition for b:

struct b_id;

template<char value>
struct b : std::integral_constant<char, value> {
    using type_id = b_id;
};

This code is highly similar to the code for a, so we can generalize it a bit with a base class:

struct a_id;
struct b_id;

template<typename ID, typename T, T value>
struct value_conf_t : std::integral_constant<T, value> {
    using type_id = ID;
};

template<int value>
struct a : value_conf_t<a_id, int, value> {};

template<char value>
struct b : value_conf_t<b_id, char, value> {};

This makes the next parameters easier to describe and avoids small mistakes.

Writing get_value_char could be achieved by replacing each int with char, but this would create a lot of duplicated code. So instead of writing get_value_char, we will replace get_value_int with a generic get_value that is able to extract any integral value type:

template<typename D, typename... Args>
struct get_value;

template<typename D, typename T2, typename... Args>
struct get_value<D, T2, Args...> {
    template<typename D2, typename T22, typename Enable = void>
    struct impl
        : std::integral_constant<decltype(D::value), get_value<D, Args...>::value> {};

    template<typename D2, typename T22>
    struct impl <D2, T22, std::enable_if_t<std::is_same<typename D2::type_id, typename T22::type_id>::value>>
        : std::integral_constant<decltype(D::value), T22::value> {};

    static constexpr const auto value = impl<D, T2>::value;
};

template<typename D>
struct get_value<D> : std::integral_constant<decltype(D::value), D::value> {};

This code is almost the same as get_value_int, except that the return type is deduced from the value of the parameter. I used decltype and auto to automatically get the correct types for the values. This is the only thing that changed.
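
The definition of the parameter c was not shown above; it holds a value of the enum type and follows exactly the same pattern as a and b (a sketch mirroring their definitions):

struct c_id;

template<type value>
struct c : value_conf_t<c_id, type, value> {};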

With that, we can handle the parameter c as well:

template<typename... Args>
struct configurable_v2 {
    static constexpr const auto A = get_value<a<1>, Args...>::value;
    static constexpr const auto B = get_value<b<'b'>, Args...>::value;
    static constexpr const auto C = get_value<c<type::BBB>, Args...>::value;

    //Coming
};

Extracting boolean flags

The parameter d is a bit different since it is a boolean flag whose mere presence sets the value to true. We could simply make it an integral boolean value (and this would work), but here I needed a boolean flag for activating a feature that is deactivated by default.

Defining the parameter is easy:

template<typename ID>
struct basic_conf_t {
    using type_id = ID;
};

struct d_id;
struct d : basic_conf_t<d_id> {};

It is similar to the other parameters, except that it has no value. You'll see later in this article why type_id is necessary here.

To check if the flag is present, we'll write the is_present template:

template<typename T1, typename... Args>
struct is_present;

template<typename T1, typename T2, typename... Args>
struct is_present<T1, T2, Args...> : std::integral_constant<bool, std::is_same<T1, T2>::value || is_present<T1, Args...>::value> {};

template<typename T1>
struct is_present<T1> : std::false_type {};

This time, the template is much simpler. We simply need to iterate through all the types of the variadic template parameter and test whether the given type is present somewhere. Again, you can see that we used two partial template specializations to handle the different cases.

With this we can now get the value for D:

template<typename... Args>
struct configurable_v2 {
    static constexpr const auto A = get_value<a<1>, Args...>::value;
    static constexpr const auto B = get_value<b<'b'>, Args...>::value;
    static constexpr const auto C = get_value<c<type::BBB>, Args...>::value;
    static constexpr const auto D = is_present<d, Args...>::value;

    //Coming
};

Extracting types

The next parameter does not hold a value but a type. It won't be an integral constant; instead, it defines a member typedef, value, with the configured type:

template<typename ID, typename T>
struct type_conf_t {
    using type_id = ID;
    using value = T;
};

struct e_id;

template<typename T>
struct e : type_conf_t<e_id, T> {};

You may think that the extraction will be very different, but in fact it is very similar. Here it is:

template<typename D, typename... Args>
struct get_type;

template<typename D, typename T2, typename... Args>
struct get_type<D, T2, Args...> {
    template<typename D2, typename T22, typename Enable = void>
    struct impl {
        using value = typename get_type<D, Args...>::value;
    };

    template<typename D2, typename T22>
    struct impl <D2, T22, std::enable_if_t<std::is_same<typename D2::type_id, typename T22::type_id>::value>> {
        using value = typename T22::value;
    };

    using value = typename impl<D, T2>::value;
};

template<typename D>
struct get_type<D> {
    using value = typename D::value;
};

Every integral constant has been replaced with an alias declaration (with using) and we need to use the typename disambiguator in front of X::value, but that's it :) We could probably have created an integral_type struct to simplify it a bit further, but I don't think that would change a lot. The code of the class follows the same changes:

template<typename... Args>
struct configurable_v2 {
    static constexpr const auto A = get_value<a<1>, Args...>::value;
    static constexpr const auto B = get_value<b<'b'>, Args...>::value;
    static constexpr const auto C = get_value<c<type::BBB>, Args...>::value;
    static constexpr const auto D = is_present<d, Args...>::value;

    using E = typename get_type<e<watcher_1>, Args...>::value;

    //Coming
};

Extracting template types

The last parameter is not a type but a template, so some slight changes are necessary to extract it. First, let's take a look at the parameter definition:

template<typename ID, template<typename> class T>
struct template_type_conf_t {
    using type_id = ID;

    template<typename C>
    using value = T<C>;
};

struct f_id;

template<template<typename> class T>
struct f : template_type_conf_t<f_id, T> {};

Here, instead of taking a simple type, we take a class template with one template parameter. This design has a significant limitation: it cannot be used for templates that take more than one template parameter. You have to create an extraction template for each combination of template parameters that you want to handle. In my case, I only had templates with one template parameter, but if you have several combinations, you'll have to write more code. It is quite simple code, since the adaptations are minor, but it is still tedious. Here is the get_template_type template:

template<typename D, typename... Args>
struct get_template_type;

template<typename D, typename T2, typename... Args>
struct get_template_type<D, T2, Args...> {
    template<typename D2, typename T22, typename Enable = void>
    struct impl {
        template<typename C>
        using value = typename get_template_type<D, Args...>::template value<C>;
    };

    template<typename D2, typename T22>
    struct impl <D2, T22, std::enable_if_t<std::is_same<typename D2::type_id, typename T22::type_id>::value>> {
        template<typename C>
        using value = typename T22::template value<C>;
    };

    template<typename C>
    using value = typename impl<D, T2>::template value<C>;
};

template<typename D>
struct get_template_type<D> {
    template<typename C>
    using value = typename D::template value<C>;
};

Again, there are only a few changes. Every previous alias declaration is now an alias template and we have to use the template disambiguator in front of value. We now have the final piece to write the configurable_v2 class:

template<typename... Args>
struct configurable_v2 {
    static constexpr const auto A = get_value<a<1>, Args...>::value;
    static constexpr const auto B = get_value<b<'b'>, Args...>::value;
    static constexpr const auto C = get_value<c<type::BBB>, Args...>::value;
    static constexpr const auto D = is_present<d, Args...>::value;

    using E = typename get_type<e<watcher_1>, Args...>::value;

    template<typename C>
    using F = typename get_template_type<f<trainer_1>, Args...>::template value<C>;
};

Validating parameter rules

If you have more parameters and several classes that are configured in this manner, the user may use a wrong parameter in the list. In that case, nothing happens: the parameter is simply ignored. Sometimes this behavior is acceptable, but sometimes it is better to make the code invalid. That's what we are going to do here, by specifying a list of valid parameters and using static_assert to enforce it.

Here is the assertion:

template<typename... Args>
struct configurable_v2 {
    static constexpr const auto A = get_value<a<1>, Args...>::value;
    static constexpr const auto B = get_value<b<'b'>, Args...>::value;
    static constexpr const auto C = get_value<c<type::BBB>, Args...>::value;
    static constexpr const auto D = is_present<d, Args...>::value;

    using E = typename get_type<e<watcher_1>, Args...>::value;

    template<typename C>
    using F = typename get_template_type<f<trainer_1>, Args...>::template value<C>;

    static_assert(
        is_valid<tmp_list<a_id, b_id, c_id, d_id, e_id, f_id>, Args...>::value,
        "Invalid parameters type");

    //Something useful
};

Since the is_valid traits class needs two variadic lists of parameters, we have to encapsulate the list of valid types in another structure (tmp_list) to separate the two sets. Here is the implementation of the validation:

template<typename... Valid>
struct tmp_list {
    template<typename T>
    struct contains : std::integral_constant<bool, is_present<typename T::type_id, Valid...>::value> {};
};

template<typename L, typename... Args>
struct is_valid;

template<typename L, typename T1, typename... Args>
struct is_valid <L, T1, Args...> : std::integral_constant<bool, L::template contains<T1>::value && is_valid<L, Args...>::value> {};

template<typename L>
struct is_valid <L> : std::true_type {};

The struct tmp_list has a single inner class (contains) that tests whether a given type is present in the list. For this, we reuse the is_present template that we created when extracting boolean flags. The is_valid template simply tests that each parameter is present in the tmp_list.

Validation could also ensure that no parameter is present twice, but I will put that aside for now.

Conclusion

Here it is :)

We now have a set of templates that allows us to configure a class at compile-time with named, optional template parameters, with defaults and in any order. I personally think that this is a great way to configure a class at compile-time and it is also another proof of the power of C++. If you think that the code is complicated, don't forget that this is only the library code; the client code, on the contrary, is at least as clear as the original version and even has several advantages.

I hope that this article interested you and that you learned something.

The code for this article is available on Github: https://github.com/wichtounet/articles/blob/master/src/named_template_par/configurable.cpp. It has been tested on Clang 3.5 and GCC 4.9.1.

SonarQube inspections for C++ projects

Back in the day, when I used to develop in Java (I hadn't discovered the wonders of C++ yet :) ), I used Sonar a lot for my projects. Sonar is a great tool for quality inspections of a project. Sonar was made for Java and is mostly free and open source (some plugins are commercial) for inspecting Java projects. Unfortunately, this is not the case for C++ inspection: the C++ plugin costs 7000 euros (more than $8500). As I mostly work on C++ for open source and school projects, I'm definitely not able to buy it. I had wanted to test the commercial C++ plugin for a long time, and for this article, SonarSource provided me with a short (very short) time-limited license for it.

There is also another option for C++, the community C++ plugin: https://github.com/wenns/sonar-cxx. I tested it some time ago, but I was not satisfied with it: I had several errors and had to use a dev version to make it work at all. Moreover, the C++11 support is nonexistent and the handling of parsing errors is not really satisfactory. But maybe it is good enough for you. This article will only focus on the commercial plugin.

Usage

For each project that you want to analyze with Sonar, you have to create a sonar-project.properties file describing some basic information about your project.
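
For reference, here is a minimal sketch of such a file (the key and name values are made up, and the language-specific properties depend on the plugin):

sonar.projectKey=my:project
sonar.projectName=My Project
sonar.projectVersion=1.0
sonar.sources=src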

Then, there are two ways to inspect a C++ project. The first and recommended one is to use the build-wrapper executable. It is a sub-project that you have to download and install alongside Sonar. It works by wrapping the commands of your build system:

build-wrapper make all

and this should gather enough information that you don't have to fill in every field in the project configuration. Then, you have to use the sonar-runner program to upload the results to Sonar.

I tried it on several projects and there seems to be a problem with the includes: it didn't include the header files in the Sonar inspections.

I finally ended up using manual configuration of the Sonar project, and the header files were included correctly. However, you normally have to put a lot of information in the configuration, including all the macros, for instance. For now, I haven't bothered generating them and it doesn't seem to impact the results too much.

When I look at the logs, it seems that there are still a lot of parsing errors. They seem mostly related to some compiler macros, especially clang's __has_feature macro. The problem is the same with the build-wrapper. When I don't use the build-wrapper, I also have other problems with the macros used for unit testing.

I also had other errors during the inspection, for instance:

error directive: This file requires compiler and library support for the ISO C++
2011 standard. This support is currently experimental, and must be enabled with
the -std=c++11 or -std=gnu++11 compiler options

I think this comes from the fact that I compile with -std=c++1y and that Sonar does not support C++14.

Inspections

Here are the results of the inspection of my ETL project:

/images/etl_dashboard.png

I really like the web interface of Sonar; it sums up all the information well, and the various plugins play quite nicely with each other. Moreover, when you check issues, you see the corresponding source code very clearly. I really think this is the strong point of Sonar.

Here is the Hotspots view for instance:

/images/etl_dashboard.png

Or the Time Machine view:

/images/etl_dashboard.png

The issues that are reported by Sonar are quite good. On this project, a lot of them relate to naming conventions, because I don't follow the conventions configured by default. However, you can easily configure the inspections to use your own naming regex or simply enable/disable some inspections.

There are some good inspections:

  • Some missing explicit keywords
  • Some commented blocks of code that can be removed
  • An if-elseif construct that should have had an else
  • Files with too high complexity

However, there are also some important false positives. For instance:

/images/etl_false_positive_1.png

Here, there is no reason to report this issue since the operator is deleted. It proves that the C++11 support is rather incomplete. I have other false positives of the same kind for = default operators and constructors. Here is another example:

/images/etl_false_positive_2.png

In this case, the variadic template support is confused with the old ellipsis notation, again showing the lack of C++11 support. There are also other false positives, for instance because of lambdas, but all of them were related to C++11.

Various

If you don't think you have enough quality rules, you can also include the ones from cppcheck, simply by giving the path to cppcheck in sonar-project.properties. I think this is great, since it works all by itself. You can also create your own rules, but you'll have to use XPath for that.

If you want, you can also include unit test reports in Sonar. I haven't tested this support, since they only support cppunit test reports and I only use Catch for my unit tests. It would have been great if the JUnit format were supported, since many tools support it.

The last option supported by this plugin is GCOV reports for code coverage information. I haven't been able to make it work: I had errors indicating that the source files were not found, and I never figured it out. It may come from the fact that I used llvm and clang to generate the GCOV reports and not G++.

Conclusion

First, here are some pros and cons for the C++ support in SonarQube.

Pros

  • Good default inspections
  • Great web interface
  • cppcheck very well integrated
  • Issues are easily configurable

Cons

  • C++11 support is incomplete and there is no C++14 support
  • build-wrapper support seems unstable; it should be integrated directly into Sonar
  • Unit test support is limited to cppunit
  • I haven't been able to make code coverage work
  • Macro support is not flexible enough
  • Too expensive
  • Quite complicated
  • No support for static analyzers other than cppcheck

The general feeling from the web interface is quite good: everything looks great and the reports are really useful. However, using the tool does not feel very professional; I had many more problems than I expected. I was also really disappointed by the C++11 support. The syntax seems to be supported, but the inspections do not understand the language features, making the C++11 support mostly useless. This is weird, since they cite C++11 as supported. Moreover, there is not yet any C++14 support, but this is less dramatic. It is also a bit sad that the import is limited to cppcheck and no other static analyzer, and the same goes for cppunit.

In my opinion, it is really an inferior product compared to the Java support. I was expecting more from an $8500 product.

For now, I probably won't use it anymore on my projects, since all of them use at least C++11, but I will probably retry Sonar for C++ in the future, hoping that it will become as good as the Java support.

Linux tip: Force systemd networkd to wait for DHCP

Recently, I started using systemd-networkd to manage my network. It works really well for static address configuration, but I experienced some problems with DHCP. There is DHCP client support integrated into systemd, so I wanted to use that instead of another DHCP client.

(If you are not familiar with systemd-networkd, you can have a look at the last section of this article)

The problem is that services do not wait for DHCP leases to be obtained. Most services (sshd for instance) wait for network.target; however, network.target does not wait for the DHCP lease to be obtained from the server. If you configured sshd to listen on a specific IP and this IP is obtained via DHCP, it will fail at startup. The same is true for NFS mounts, for instance.

Force services to wait for the network to be configured

The solution is to make services like sshd wait for network-online.target instead of network.target. There is a simple way in systemd to override default service files: for an X.service, systemd also parses all the /etc/systemd/system/X.service.d/*.conf files.

For instance, to make sshd start only after DHCP is finished, create /etc/systemd/system/sshd.service.d/network.conf:

[Unit]
Wants=network-online.target
After=network-online.target

However, by default, network-online.target does not wait for anything. You'll have to enable another service to make it work:

systemctl enable systemd-networkd-wait-online

Another note: at least on Gentoo, I had to use systemd-216 for this to work:

emerge -a "=sys-apps/systemd-216"

And after this, it worked like a charm at startup.

Force NFS mounts to wait for the network

There is no service file for NFS mounts, but there is a target, remote-fs.target, that groups the remote filesystem mounts. You can override its configuration in the same way as for a service, with /etc/systemd/system/remote-fs.target.d/network.conf:

[Unit]
Wants=network-online.target
After=network-online.target

Conclusion

Here we are, I hope this tip will be useful to some of you ;)

Appendix. Configure interface with DHCP with systemd

To configure an interface with DHCP, you have to create a .network file in /etc/systemd/network/. For instance, here is my /etc/systemd/network/local.network file:

[Match]
Name=enp3s0

[Network]
DHCP=v4

and you have to enable systemd-networkd:

systemctl enable systemd-networkd

budgetwarrior 0.4.1 - Expense templates and year projection

I've been able to finish version 0.4.1 of budgetwarrior earlier than I thought :)

Expense templates

The "most useful" new feature of this release is the ability to create template for expenses.

For that, you can give an extra parameter to budget expense add:

budget expense add template name

This works exactly the same as creating a new expense, except that the expense is also saved as a template. Then, the next time you run:

budget expense add template name

A new expense will be created with today's date and with the name and amount saved in the template. You can create as many templates as you want, as long as they have different names. You can see all your templates with 'budget expense template'. A template can be deleted in exactly the same way as an expense, with 'budget expense delete id'.

I think this is very useful for expenses that are made several times a month, for instance a coffee at your workplace. The price should not change a lot, and it is faster to just use the template name rather than entering all the information again.
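
For instance, here is a hypothetical session with a template named coffee (assuming the template name is the extra parameter):

budget expense add coffee      # first time: create the expense and save it as the template 'coffee'
budget expense add coffee      # later: create a new expense from the template, dated today
budget expense template        # list all your templates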

Year prediction

You can now see what next year would look like if you changed your expenses a bit. For instance, how much would you still have at the end of the year if you increased your house expenses by 20% and reduced your insurance by 5%?

The 'budget predict' command can be used for that purpose. You enter a multiplier for each account in your budget, and a new year is "predicted" based on the expenses of the current year multiplied by the specified multiplier:

/images/budget_041_prediction.png

I think that this feature can be very useful if you want to estimate your budget when moving to a more expensive house or switching to another insurance, for instance.

Various changes

Two accounts can be merged together with the 'budget account migrate' command. This command moves all expenses from one account to another and adapts the amount of the target account. The source account is then deleted. Accounts that have already been migrated are supported as well.

The 'budget wish list' command now displays the mean accuracy of your predictions.

You don't need Boost anymore for this project. The only remaining dependency is libuuid. I will perhaps remove it in the next version, since the UUIDs are not currently used in the application.

The command 'budget gc' cleans the IDs of all your data in order to fill the holes and make all the IDs contiguous. It is mostly a feature for order freaks like me who do not like to have holes in a sequence of identifiers ;)

There was a bug in the monthly report causing the scale to be displayed completely misaligned; it is now fixed:

https://raw.githubusercontent.com/wichtounet/budgetwarrior/develop/screenshots/budget_report.png

Installation

If you are on Gentoo, you can install it using layman:

layman -a wichtounet
emerge -a budgetwarrior

If you are on Arch Linux, you can use this AUR repository.

For other systems, you'll have to install from sources:

git clone git://github.com/wichtounet/budgetwarrior.git
cd budgetwarrior
make
sudo make install

Conclusion

If you are interested in the sources, you can download them on Github: budgetwarrior.

If you have a suggestion for a new feature or you found a bug, please post an issue on Github; I'd be glad to help you.

If you have any comment, don't hesitate to contact me, either by leaving a comment on this post or by email.

A Mutt journey: Search mails with notmuch

In the previous installment in the Mutt series, I've talked about my Mutt configuration. In this post, I'll talk about notmuch and how to use it to search through mails.

By default, you can search mails in Mutt with the / key. This only searches the current folder. It is very fast, but it is not always what you want: when you don't know which folder the mail you are looking for is in, you don't want to try each folder in turn. Out of the box, Mutt has no feature for global search.

That is where notmuch comes to the rescue. notmuch is a very simple tool that lets you search through your mail. As its name indicates, it does not do much. It doesn't download your mails; you have to have them locally, which is perfect if you use offlineimap. It does not provide a user interface either, but you can query it from the command line and it can be used from other tools. It should be available in most distributions.

Configuration

The configuration of notmuch is fairly simple. You can write your .notmuch-config directly or run notmuch setup, which interactively helps you fill in the configuration.

Here is my configuration:

[database]
path=/data/oi/Gmail/

[user]
name=Baptiste Wicht
primary_email=baptiste.wicht@gmail.com

[new]
tags=inbox
ignore=

[search]
exclude_tags=deleted;

[maildir]
synchronize_flags=true

It needs, of course, the place where your mails are stored, then some information about you. The [new] section specifies which tags to add to new mails; here, I specified that each new mail must be tagged with inbox, and you can add several tags if you want. In the [search] section, the excluded tags are specified.

Usage

Once you have configured notmuch, you can run notmuch new to index all existing mails. The first run may take some time (minutes, so it is still quite fast), but subsequent runs will be very fast. You should run notmuch new after each offlineimap run. I personally run it in a shell script invoked by cron, but you could also use one of the hooks of offlineimap to run notmuch.
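
For instance, a minimal sketch of such a script (the paths and options may need adapting to your setup):

#!/bin/sh
# Fetch mail once, then index the new messages
offlineimap -o
notmuch new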

Once indexing has been done, you can start searching your mails. The first option is simply to use notmuch search <query> from the command line, which directly displays the results. Search is instant on my mails.
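
A few example queries (the addresses here are made up; from:, to:, subject: and tag: are the most common search terms):

notmuch search from:alice@example.com
notmuch search subject:invoice and tag:inbox
notmuch search to:notmuch@notmuchmail.org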

If you use mutt-kz like me, notmuch support is directly integrated. You can type X and then type a query like notmuch://?query=X, and the results will be displayed as a normal Mutt folder. You can open mails directly from there and you can also edit the mails as if you were in their source folders. This is really practical.

If you use plain Mutt, you can have the same experience by using the notmuch-mutt patch (see <http://notmuchmail.org/notmuch-mutt/>). In several distributions, there is an option to build Mutt with this support, or another package that adds the feature.

Another feature of notmuch is its ability to tag mails. It automatically tags new and deleted mails, but you can also explicitly tag messages with notmuch tag. For instance, to tag all messages from the notmuch mailing list:

notmuch tag +notmuch -- tag:new and to:notmuch@notmuchmail.org

I personally don't use this feature, since I use imapfilter and IMAP folders to sort my mail, but it can be very useful. You can run these commands in the cron job and always have your tags up to date. Tags can then be used in notmuch to search or to create virtual folders in Mutt.

Conclusion

That is already more or less everything there is to know about notmuch. It does not do a lot of things, but it does them really well.

That concludes the series of posts on Mutt. If you have any question about my Mutt configuration, I'd be glad to elaborate in the comments.

Catch: A powerful yet simple C++ test framework

Recently, I came across a new test framework for C++ programs: Catch.

Until I found Catch, I was using the Boost Test Framework. It works quite well, but the problem is that you need to build Boost and link against the Boost Test Framework, which is not very convenient. I wanted something lighter and easier to integrate.

Catch is header-only; you only have to include one header in each test file. Moreover, it is very easy to combine several source files without linking problems.

Usage

The usage is really simple. Here is a basic example:

#define CATCH_CONFIG_MAIN
#include "catch.hpp"

TEST_CASE( "stupid/1=2", "Prove that one equals 2" ){
    int one = 1;
    REQUIRE( one == 2 );
}

The define ensures that Catch generates a main for you. It should only be defined in one of your test files if you have several. You define a new test case using the TEST_CASE macro. It takes two parameters: the first one is the name of the test case; you can use any name, you don't have to use a valid C++ identifier. The second parameter is a longer description of the test case.

You then use REQUIRE to verify a condition. You can also use CHECK to verify a condition, the difference being that it does not stop the test if the condition is false. CHECK is a good tool for grouping conditions that are related. There are also REQUIRE_FALSE and CHECK_FALSE versions.

As you can see, there is no REQUIRE_EQUALS or the like; you can use any comparison operator you want inside REQUIRE.

This produces an executable that will, by default, run every test it contains. You can also configure the output report to be XML or JUnit, or run a subset of your tests. Take a look at the command-line usage by running the executable with the -h option if you want more information.

Here is the result of the previous test:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
catch_test_1 is a Catch v1.0 b52 host application.
Run with -? for options

-------------------------------------------------------------------------------
stupid/1=2
-------------------------------------------------------------------------------
src/catch/test1.cpp:4
...............................................................................

src/catch/test1.cpp:6: FAILED:
  REQUIRE( one == 2 )
with expansion:
  1 == 2

===============================================================================
test cases: 1 | 1 failed
assertions: 1 | 1 failed

For each failed condition, the source location is printed as well as some information on the test that failed. What is also interesting is the "with expansion" information that shows the LHS and RHS of the comparison operator.

You can also check for exceptions with several macros:

  • REQUIRE_THROWS(expression) and CHECK_THROWS(expression) verify that an exception is thrown when the given expression is evaluated.
  • REQUIRE_THROWS_AS(expression, exception_type) and CHECK_THROWS_AS(expression, exception_type) verify that an exception of the given type is thrown.
  • REQUIRE_NOTHROW(expression) and CHECK_NOTHROW(expression) verify that no exception is thrown.
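
Here is a short sketch of these macros in action, using std::stoi, which throws std::invalid_argument on bad input (this goes in a test file that includes catch.hpp):

#include <string>
#include <stdexcept>

TEST_CASE( "exceptions/stoi", "Demonstrate the exception macros" ){
    REQUIRE_THROWS( std::stoi("not a number") );
    REQUIRE_THROWS_AS( std::stoi("not a number"), std::invalid_argument );
    REQUIRE_NOTHROW( std::stoi("42") );
}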

Conclusion

I have only covered the most basic features; there is more that you can do with Catch: fixtures, logging and BDD-style test cases, for instance. For more information, you can read the reference documentation.

I'm really satisfied with this framework. It can also be used for Objective-C if you are interested. You can download Catch on Github.

If you want more examples, you can take a look at the ETL tests that are all made with Catch.

ETL - C++ library for vector and matrix computations

When working on Machine Learning algorithms, I needed a simple library to ease working with vectors and matrices. This is the reason why I started developing ETL (Expression Template Library).

ETL is a small header-only library for C++ that provides vector and matrix classes with support for Expression Templates to perform very efficient operations on them.

The library supports statically sized and dynamically sized vector and matrix structures with efficient element-wise operations. All the operations are implemented lazily with Expression Templates: they are only executed once the expression is assigned to a concrete structure.

Data structures

Several structures are available:

  • fast_vector<T, Rows>: A vector of size Rows with elements of type T. This must be used when you know the size of the vector at compile-time.
  • dyn_vector<T>: A vector with elements of type T. The size of the vector can be set at runtime.
  • fast_matrix<T, Rows, Columns>: A matrix of size Rows x Columns with elements of type T. This must be used when you know the size of the matrix at compile-time.
  • dyn_matrix<T>: A matrix with elements of type T. The size of the matrix can be set at runtime.

All the structures are size-invariant: once set, they cannot be grown or shrunk.

In every operation that involves the fast versions of the structures, all the sizes are known at compile-time, which gives the compiler a lot of opportunities for optimization.

Element-wise operations

Classic element-wise operations can be done on vectors and matrices as if they were scalars. Matrices and vectors can also be added, subtracted, divided, etc. by scalars.

Here is an example of what can be done:

etl::dyn_vector<double> a({1.0,2.0,3.0});
etl::dyn_vector<double> b({3.0,2.0,1.0});

etl::dyn_vector<double> c(1.4 * (a + b) / b + b + a / 1.2);

All the operations are only executed once the expression is evaluated to construct the dyn_vector. No temporaries are involved. This is as efficient as if a single for loop had been used and each element computed directly.

You can easily assign the same value to every element of a structure by using operator = on it.
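
For instance, a small sketch based on the description above:

etl::dyn_vector<double> v({1.0, 2.0, 3.0});

v = 0.0; // every element of v is now 0.0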

Unary operators

Several unary operators are available. Each operation is performed on every element of the vector or the matrix.

Available operators:

  • log
  • abs
  • sign
  • max/min
  • sigmoid
  • noise: Add standard normal noise to each element
  • logistic_noise: Add normal noise of mean zero and variance sigmoid(x) to each element
  • exp
  • softplus
  • bernoulli

Several transformations are also available:

  • hflip: Flip the vector or the matrix horizontally
  • vflip: Flip the vector or the matrix vertically
  • fflip: Flip the vector or the matrix horizontally and vertically. It is the equivalent of hflip(vflip(x))
  • dim/row/col: Return a vector representing a sub-part of a matrix (a row or a column)
  • reshape: Interpret a vector as a matrix

Again, all these operations are performed lazily, they are only executed when the expression is assigned to something.

Lazy evaluation

All binary and unary operations are applied lazily, only when they are assigned to a concrete vector or matrix class.

The expression can be evaluated using the s(x) function that returns a concrete class (fast_vector, fast_matrix, dyn_vector, dyn_matrix) based on the expression.

Reduction

Several reduction functions are available:

  • sum: Return the sum of a vector or matrix
  • mean: Return the mean of a vector or matrix
  • dot: Return the dot product of two vectors or matrices

Functions

The header convolution.hpp provides several convolution operations, both in 1D (vectors) and 2D (matrices). All the convolutions are available in valid, full and same versions.

The header multiplication.hpp provides the matrix multiplication operation (mmul). For now, only the naive algorithm is available. I'll probably add support for the Strassen algorithm in the near future.

It is possible to pass an expression rather than a data structure to these functions. You have to keep in mind that expressions are lazy; therefore, if you pass a + b to a matrix multiplication, an addition will be run each time an element is accessed (n^3 times), so it is rarely efficient.

Examples

Here are some examples of these operators (taken from my Machine Learning Library):

h_a = sigmoid(b + mmul(reshape<1, num_visible>(v_a), w, t));
h_s = bernoulli(h_a);
h_a = min(max(b + mmul(reshape<1, num_visible>(v_a), w, t), 0.0), 6.0);
h_s = ranged_noise(h_a, 6.0);
weight exp_sum = sum(exp(b + mmul(reshape<1, num_visible>(v_a), w, t)));

h_a = exp(b + mmul(reshape<1, num_visible>(v_a), w, t)) / exp_sum;

auto max = std::max_element(h_a.begin(), h_a.end());

h_s = 0.0;
h_s(std::distance(h_a.begin(), max)) = 1.0;

Conclusion

This library is available on Github: etl. It is licensed under the MIT license.

It is header-only, so you don't have to build it. However, it uses some recent C++14 features, so you'll need a recent version of Clang or G++ to be able to use it.

If you find an issue or have an idea to improve it, just post it on Github or as a comment here and I'll do my best to work on it. If you have any question about the usage of the library, I'd be glad to answer it.

A Mutt journey: Mutt configuration

If you've followed my Mutt posts, you'll know that I'm filtering my mails with imapfilter and downloading them with offlineimap.

In this post, I'll share my Mutt configuration. I'm not using Mutt directly but mutt-kz, a fork with good notmuch integration. For this post, it won't change anything.

Configuration

The complete configuration goes in the .muttrc file. Mutt configuration supports the source command, so you can put some of your settings in other files and source them from the .muttrc file. You'll see that the configuration can soon grow large; splitting it into several files will save you a lot of maintenance issues ;)
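
For instance (the file names are just examples):

# inside .muttrc
source ~/.mutt/bindings
source ~/.mutt/colors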

First, let's tell Mutt who we are:

set from = "baptiste.wicht@gmail.com"
set realname = "Baptiste Wicht"

Receive mail

As I'm using offlineimap to get my mails, there are no IMAP settings in my configuration. But you need to tell Mutt where the mails are:

set folder = /data/oi/

set spoolfile = "+Gmail/INBOX"
set postponed = "+Gmail/drafts"

source ~/.mutt/mailboxes

The spoolfile and postponed settings specify the inbox and drafts mailboxes. The .mutt/mailboxes file is generated by offlineimap.

By default, Mutt asks you to move read messages from INBOX to another mailbox (set by mbox). I personally leave my read messages in my inbox and move them to a folder myself. For that, you have to disable the move:

set move = no

If you move a mail from one mailbox to another, Mutt asks for confirmation. You can disable this confirmation:

set confirmappend = no

If you use Mutt, you probably want to read plain-text messages rather than monstrous HTML. You can tell Mutt to always prefer the text/plain part if there is one:

alternative_order text/plain text/html

If the mail has no text/plain part, you can still manage to read HTML in Mutt in an almost sane format. First, you need to tell Mutt to open HTML messages:

auto_view text/html

And then, you need to tell it how to open them. Mutt reads a mailcap file to know how to open content. You can tell Mutt where it is:

set mailcap_path = ~/.mailcap

And then, you have to edit the .mailcap file:

text/html; w3m -I %{charset} -T text/html; copiousoutput;

That uses w3m to render the message inside Mutt. It works quite well. You can also use lynx if you prefer.

Send mail

You need to tell Mutt how to send mail:

set smtp_url = "smtp://baptistewicht@smtp.gmail.com:587/"
set smtp_pass = "SECRET"

Some people prefer to use another SMTP client instead of Mutt's built-in SMTP support; you can do that too by setting sendmail to your mailer program.

It is generally a good idea to enforce the charset of sent mail:

set send_charset="utf-8"

You can choose another charset if you prefer ;)

You need to configure vim to correctly handle mail editing:

set editor='vim + -c "set textwidth=72" -c "set wrap" -c "set spell spelllang=en"'

This sets the width of the text, enables wrapping and configures spell checking.

By default, Mutt asks whether you want to include the body of the message you are replying to in your answer, and prompts for the reply subject. You can make this faster with these two lines:

set include=yes
set fast_reply

Once mails are sent, they are copied to your outgoing mailbox. If you use GMail, the SMTP server already does that for you, so you should disable this behavior:

set copy = no

Appearance

Many things can also be configured in the appearance of Mutt. If you like the threaded view of GMail, you want to configure Mutt in a similar way:

set sort = 'threads'
set sort_aux = 'reverse-last-date-received'

It is not as good as the GMail view, but it does the job :)

You can make reading mail more comfortable using smart wrapping:

set smart_wrap

A mail has many many headers and you don't want to see them all:

ignore *
unignore From To Reply-To Cc Bcc Subject Date Organization X-Label X-Mailer User-Agent

With that, you just configure which headers you're interested in.

If you're using the sidebar patch (and you should be ;), you can configure the sidebar:

set sidebar_visible = yes
set sidebar_width = 35
set sort_sidebar = desc

color sidebar_new yellow default

This makes the sidebar always visible, with a width of 35, and sorts the mailboxes. The last line colors the mailboxes that have unread mail in yellow.

The index_format setting lets you choose what is shown for every mail in the index view:

set index_format = "%4C %Z %{%b %d} %-15.15L %?M?(#%03M)&(%4l)? %?y?{%.20y}? %?g?{%.20g} ?%s (%c)"

This is a classic example that displays the sender, the flags, the date, the subject, the size of the mail and so on. Look at the Reference for more information on what you can do with the format variables; there is plenty of information that can be shown.

You can also configure the text that is present on the status bar:

set status_chars  = " *%A"
set status_format = "───[ Folder: %f ]───[%r%m messages%?n? (%n new)?%?d? (%d to delete)?%?t? (%t tagged)? ]───%>─%?p?( %p postponed )?───"

The example here displays the current folder, the number of mails in it, with some details on deleted and unread mails, and finally the number of postponed mails. Again, if you want more information, you can read the reference.

You can configure Mutt so that the index view stays visible while you read mails. For instance, to always show 8 mails in the index:

set pager_index_lines=8

Another important thing you can configure is the colors of Mutt. I'm not going to cover everything, since Mutt is very powerful in this area. For instance, here are some examples from my configuration:

color index red white "~v~(~F)!~N"                 # collapsed thread with flagged, no unread
color index yellow white "~v~(~F~N)"               # collapsed thread with some unread & flagged
color index_subject brightred default "~z >100K"
color header blue default "^(Subject)"

Unless you really want to spend time on this part, I recommend picking an existing theme. I took a Solarized theme here. It looks quite good and works well. There are other themes available; you'll surely find the one that looks best to you.

Bindings

Bindings are always very important. If, like me, you're a vim aficionado, you'll want your Mutt bindings to be as close as possible to vim's. The default settings are quite good, but not always close to vim.

Something important to know when you configure Mutt bindings is that they are relative to the currently open view (index, pager, browser, attach, ...). You can bind a keystroke to a different action in each view. You can also select several views in which the keystroke is valid.

If you are using the sidebar patch (and again, you should ;) ), you'll want to configure fast bindings for it. Here are mine:

bind index,pager \Ck sidebar-prev
bind index,pager \Cj sidebar-next
bind index,pager \Cl sidebar-open
bind index,pager \Cn sidebar-scroll-up
bind index,pager \Cv sidebar-scroll-down
bind index,pager \Ct sidebar-toggle

I use Ctrl+j and Ctrl+k to move inside the sidebar, Ctrl+l to open a folder, and Ctrl+n and Ctrl+v to scroll up and down. The last binding toggles the sidebar, which is useful for instance if you use multiple sidebars with notmuch.

I also find l very good for opening messages in the index:

bind index l display-message
bind index gg first-entry
bind index G last-entry
bind index h noop               # Disable h

gg and G are used to go to the first and last entry. Here I disabled h, which was bound to a rarely used command.

The pager is the view where you read mail:

bind pager h exit
bind pager gg top
bind pager G bottom
bind pager J next-line
bind pager K previous-line

In this view, I use h to get out of the pager, and gg and G as usual. As I always leave the index open, j and k already move in the index, so I chose J and K to move in the pager.

The browser is the view where you select folders for instance:

bind browser l select-entry
bind browser L view-file
bind browser gg first-entry
bind browser G last-entry
bind browser h exit

Again, I use l and h to go back and forth and gg and G to go to the first and last entry. j and k are already used here to go up and down.

In the attach view:

bind attach h exit
bind attach e edit-type # Edit MIME Types
bind attach l view-attach

I use h to exit and l to view an attachment.

That is it for my bindings, but you can configure a lot more of them.

Conclusion

This is the end of this post. I have covered my complete Mutt configuration here. My .muttrc is available online.

If you have comments on my configuration, you're welcome to leave a comment on this post ;)

In the next post in my "Mutt journey", I'll talk about notmuch; it will likely be the last post in this series.
