Run your Boost Tests in parallel with CMake

I was looking for a test library that could run the eddic tests in parallel, as a replacement for the Boost Test Library. I posted my question on StackOverflow and someone posted an awesome solution. With CMake and a small additional CMake file, it is possible to run tests written with the Boost Test Library in parallel without changing anything in the test code!

CTest is the test runner that ships with CMake. It can run tests in parallel using the -j X option (X is the number of parallel jobs). However, it can only run the tests that are declared in the CMakeLists.txt file. In my case, this means only one (the executable built with the Boost Test Library). If you have T tests, one solution would be to create T executable files, which ctest could then run in parallel. However, this is not very practical. The solution proposed in this article is better.

Integrate Boost Test Library in CMake

Ryan Pavlik provides a series of CMake modules in his GitHub repository. One of these modules is named BoostTestTargets. It automatically generates the CTest commands to run all the tests that you have. The small drawback is that you have to list all the tests.

To start, you have to download these files:

These files must be placed next to your CMakeLists.txt file. Then, you have to modify your CMakeLists.txt file to enable testing and enable the new module. For example, if you have two test suites and five tests in each:

INCLUDE(CTest)

ENABLE_TESTING()

file(
    GLOB_RECURSE
    test_files
    test/*
)

include(BoostTestTargets.cmake)

add_boost_test(eddic_boost_test
    SOURCES ${test_files}
    TESTS 
    TestSuiteA/test_1
    TestSuiteA/test_2
    TestSuiteA/test_3
    TestSuiteA/test_4
    TestSuiteA/test_5
    TestSuiteB/test_1
    TestSuiteB/test_2
    TestSuiteB/test_3
    TestSuiteB/test_4
    TestSuiteB/test_5
    )

All the test files are collected from the test directory and passed as SOURCES. Then each test is declared by name, in the form suite/test.
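To see why every test has to be listed, it helps to picture what one CTest entry amounts to. The sketch below assumes that each declared name is run through the Boost.Test --run_test filter, one invocation per test. The exact command line generated by BoostTestTargets may differ, and the binary path is only illustrative.

```shell
# Hypothetical illustration: one command per declared test.
# --run_test is the Boost.Test filter option; the binary path is an assumption.
test_binary=./eddic_boost_test
commands=""
for t in TestSuiteA/test_1 TestSuiteA/test_2 TestSuiteB/test_1; do
    # Each entry is independent, so ctest is free to schedule them in parallel
    commands="${commands}${test_binary} --run_test=${t}
"
done
printf '%s' "$commands"
```

Since each line is a separate process, a test that is not listed is simply never started.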

The main test file has to include a specific header file:

#define BOOST_TEST_MODULE eddic_test_suite
#include <BoostTestTargetConfig.h>

This file will be automatically detected by BoostTestTargets and configured correctly. And that's it!

You can run CMake again in your build directory to use the new test system:

cmake .

If the configuration has been successful, you will see a message indicating that. For example, I see that:

-- Test 'eddic_boost_test' uses the CMake-configurable form of the boost test framework - congrats! (Including File: /home/wichtounet/dev/eddi/eddic/test/IntegrationTests.cpp)
-- Configuring done
-- Generating done
-- Build files have been written to: /tmp/ramdrive/dev/eddic

Run tests in parallel

You can then run your tests in parallel with ctest. For instance, with 8 parallel jobs:

ctest -j 8

In my case, my tests complete 6x faster! This is very valuable when you run your tests often.
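Rather than hard-coding the job count, you can derive it from the machine. This is a minimal sketch, assuming a system where nproc (GNU coreutils) or getconf is available; the variable name njobs is made up for the example.

```shell
# Pick one job per available core. nproc comes from GNU coreutils; getconf is
# used as a fallback, and 1 as a last resort.
njobs=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# Run the tests only when ctest is installed and the current directory is a
# configured build directory (CMake writes CTestTestfile.cmake there).
if command -v ctest >/dev/null 2>&1 && [ -f CTestTestfile.cmake ]; then
    ctest -j "$njobs" --output-on-failure
else
    echo "would run: ctest -j $njobs --output-on-failure"
fi
```

The --output-on-failure option only prints the output of the tests that fail, which keeps a parallel run readable.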

For more information on how to integrate your Boost Test Library tests with CMake, you can consult the cmake-modules repository.

Use CMake to easily compile Latex documents into PDF

Everyone who compiles Latex documents by hand knows that it is tedious. You have to compile the file several times to resolve the references. Moreover, if you have a glossary or an index, you have to run other commands between the Latex passes so that everything is correctly resolved. A better way to handle Latex compilation is to write a Makefile that builds each part. However, writing a Latex Makefile by hand is neither easy nor especially interesting.
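For reference, the manual sequence described above looks roughly like this for a document with a bibliography, an index and a glossary. This is only a sketch: it assumes the standard pdflatex, bibtex and makeindex tools, a glossary built with the glossaries package (hence makeglossaries), and a function name that is made up for the example.

```shell
# Hypothetical helper: build <name>.pdf by hand, the way a Makefile would.
build_latex_by_hand() {
    name=$1
    pdflatex "$name" &&        # first pass: writes .aux, .idx and glossary data
    bibtex "$name" &&          # resolve citations from the .bib files
    makeindex "$name" &&       # build the index from the .idx file
    makeglossaries "$name" &&  # build the glossary
    pdflatex "$name" &&        # second pass: pull in bibliography, index, glossary
    pdflatex "$name"           # third pass: settle cross-references and page numbers
}
```

Calling build_latex_by_hand master would then build master.pdf from master.tex, stopping at the first command that fails.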

As I use CMake for most of my development projects, I looked for a CMake script that could generate such a Makefile easily. I found a good script for that, but I wanted to add some features and change some things, so I forked it on GitHub: the CMakeLatex repository.

Usage

Here is an example using all the features of the script for one of my Latex documents.

cmake_minimum_required(VERSION 2.8)
PROJECT(master_project NONE)
SET(LATEX_OUTPUT_PATH build)
INCLUDE(UseLATEX.cmake)

file(GLOB_RECURSE contents_files RELATIVE ${CMAKE_SOURCE_DIR} contents/*.tex)

ADD_LATEX_DOCUMENT(
    master.tex
    INPUTS ${contents_files}
    IMAGE_DIRS images
    BIBFILES bibliography.bib
    USE_INDEX
    USE_GLOSSARY
    FILTER_OUTPUT
    )

To use it, you have to download the files of the repository and put them alongside your Latex files (or just make symlinks to the files in a clone of the repository for easy updates). Then, the UseLATEX.cmake file has to be included in your CMakeLists.txt file.

I think that it is good practice to generate the output files in a separate directory. This directory can be set using the LATEX_OUTPUT_PATH variable.

Then, to add a latex document, you can use the ADD_LATEX_DOCUMENT function. The first parameter is the name of the main Latex file. After that, you have to give several parameters:

  • INPUTS: The list of Latex files that are included in the master file. I use the file(GLOB_RECURSE ...) command to find all of them in a contents subfolder.
  • IMAGE_DIRS: The directories where the images are stored. They will be copied to the build folder and automatically converted if necessary.
  • BIBFILES: If you have a bibliography, you just have to list all the .bib files of your project.
  • USE_INDEX: Necessary only if your document uses an index.
  • USE_GLOSSARY: Necessary only if your document uses a glossary.
  • FILTER_OUTPUT: This option activates the filtering of pdflatex output to the console. For now, the option is quite limited, but it gives you a cleaner output. Keep in mind that this option hides the overfull and underfull box warnings.
  • CONFIGURE: You can use the CMake configuration feature on some of your files if you want CMake variables to be replaced in the documents.

Once your Latex document is configured, you can just run cmake on your project. After that, you can use targets to generate pdf:

  • make pdf: This will build the pdf using several passes and running all the necessary commands.
  • make fast: This will generate a pdf in only one pass. This can be useful if you want to see a rough draft of your document quickly.

I already use this script for several of my documents. I hope that it will be useful for some of you. If you find any problem in the script or in the generated Makefile, or if you have an idea for improvement, don't hesitate to leave a comment or to open an Issue or a Pull Request in the CMakeLatex repository.

This script only supports pdflatex and can only generate pdf directly. If you want latex support with dvi/ps/pdf generation, you should take a look at the original project: CMakeUserUseLATEX

Linux symbolic links (soft) and hard links

On Linux, it is possible to create links to an existing file. These links can be either symbolic links or hard links. Each kind has advantages and drawbacks. In this small post, we will see the differences between the two kinds of links and how to use them.

Hard Link

A hard link refers directly to the physical location of another file (an inode to be precise).

A hard link has some limitations: it cannot refer to a directory and cannot cross file system boundaries. It means that you can only create hard links to files on the same file system as the hard link itself.

When the source of the link is moved or removed, the hard link still refers to the same data, because the inode stays alive as long as at least one name points to it.
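This behaviour is easy to check in a throwaway directory. The sketch below uses made-up file names and the -ef test operator, which is not strictly POSIX but is supported by bash, dash and ksh.

```shell
# Work in a temporary directory so nothing real is touched
workdir=$(mktemp -d)

echo "hello" > "$workdir/source_file"
ln "$workdir/source_file" "$workdir/link"    # second name for the same inode

# -ef is true when both names refer to the same device and inode
if [ "$workdir/source_file" -ef "$workdir/link" ]; then
    same_inode=yes
else
    same_inode=no
fi

rm "$workdir/source_file"                    # remove the original name only
content=$(cat "$workdir/link")               # the data is still reachable

echo "same inode: $same_inode, content after rm: $content"
rm -rf "$workdir"                            # clean up
```

The file content survives the rm because only one of its two names was removed.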

Hard links are created with the ln command. For instance, to create a hard link to source_file:

ln source_file link

Symbolic Link

A symbolic link refers to a symbolic path indicating the location of the source file. You can see it as a link to a path (itself referring to an inode).

A symbolic link is less limited. It can refer to a directory and can cross file system boundaries.

However, when the source of the link is moved or removed, the symbolic link is not updated: it keeps pointing to the old path and becomes a broken (dangling) link.
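The same kind of experiment shows a symbolic link breaking. This is a minimal sketch with made-up file names; it assumes the readlink command is available, which is the case on typical Linux systems.

```shell
workdir=$(mktemp -d)

echo "hello" > "$workdir/source_file"
ln -s source_file "$workdir/link"        # the link stores the relative path "source_file"

target=$(readlink "$workdir/link")       # the stored path, not the file content
before=$(cat "$workdir/link")            # reading through the link works for now

mv "$workdir/source_file" "$workdir/moved_file"

# -L: the link itself still exists; -e follows it and fails because the path is gone
if [ -L "$workdir/link" ] && [ ! -e "$workdir/link" ]; then
    state=dangling
else
    state=valid
fi

echo "target: $target, before: $before, after mv: $state"
rm -rf "$workdir"                        # clean up
```

Note that the path stored in a relative symbolic link is resolved from the directory that contains the link, not from the current directory.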

Symbolic links are created with the ln command and its -s option. For instance, to create a symbolic link to source_file:

ln -s source_file link

Deletion

The deletion of a link (hard or symbolic) can be achieved with the rm or unlink commands:

rm link
unlink link
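In both cases, only the link is deleted and the source file is left untouched. Here is a small sketch of that, again with made-up file names in a temporary directory:

```shell
workdir=$(mktemp -d)

echo "hello" > "$workdir/source_file"
ln "$workdir/source_file" "$workdir/hard"    # hard link
ln -s source_file "$workdir/soft"            # symbolic link

rm "$workdir/hard"        # drops one name; the inode keeps its other name
unlink "$workdir/soft"    # removes the link itself, never its target

remaining=$(ls "$workdir")
content=$(cat "$workdir/source_file")

echo "remaining: $remaining, content: $content"
rm -rf "$workdir"         # clean up
```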

Conclusion

And that's it!

Symbolic and hard links are very useful tools and are very easy to use.

I hope that this blog post helped you understand a little better the differences between the two types of links and how to use them.

Packt Publishing celebrates its 1000th IT Book!

Packt Publishing is about to publish its 1000th title, on the 30th of September, 2012.

Packt published their first book in April 2004. They now have a lot of books on just about every subject, from web development to IT architecture and from games to e-commerce. Their books are known for their high quality.

For this occasion, they are offering a surprise gift to all their members. If you want to be part of it, you just have to sign up for a free Packt Publishing account. If you're already registered, you don't have anything to do! You need to be registered before the 30th of September in order to get involved.

Packt is also known for their support of Open Source. They support Open Source projects through a project royalty donation and have already contributed over £300,000. For this special occasion, they will allocate 30,000 to share between projects and authors, in a way that will be disclosed on the website soon.

For more information about Packt Publishing, their books or how to sign-up for a free account for this offer, you can view the official website: http://www.packtpub.com/

Back in Berkeley, California

I arrived yesterday in Berkeley, California.

Just like I did my Bachelor thesis at Lawrence Berkeley National Laboratory (LBNL), I will do my Master Thesis there too. The thesis will last a bit less than a semester.

During my Master Thesis I will try to use profiling samples from the Linux perf tools in GCC or Clang to optimize processor cache usage (avoid cache misses and page faults).

I will try to publish some posts about that during the semester if I have time.

EDDI Compiler 1.1.3 - Templates

I finished version 1.1.3 of the EDDI Compiler (eddic).

The main improvement to the language is the support of templates. The syntax is more or less the same as the syntax of C++ templates, but the features are much more limited. In EDDI, you can declare class templates and function templates. Class templates can also include member function templates.

Here is an example of the use of templates in EDDI:

template<type T>
struct node {
    T value;

    this(T init){
        print("C1|");
        this.value = init;
    }

    T get_value(){
        return this.value;
    }

    template<type U>
    void print_value(U v){
        print(v);
        print("|");
    }
}

template<type T>
void debug(T t){
    print(t);
    print("|");
}

template<type T>
void test(node<T>* node){
    debug<T>(node.value);
    debug<T>(node.get_value());
}

void main(){
    node<int> first_node(100);
    node<float> second_node(13.3);

    test<int>(first_node);
    test<float>(second_node);

    first_node.print_value<float>(1.0);
    second_node.print_value<int>(10);
}

This new feature adds generic programming capabilities to the language.

This version also adds other language improvements. The first one is the support of the ! operator for a bool, to test if a bool is false. This version also includes support for iterating through all the chars of a string with a foreach loop. And finally, the this pointer is now implicit to access member fields of a struct from member functions.

The optimization engine has been greatly improved. Pointers are handled much better and some regressions due to new features have been fixed. The Constant Propagation optimization can take default values of structs and arrays into account. Finally, functions with char parameters can now be inlined.

Finally, the compiler uses a new logging system that can be completely removed at compile-time for release versions.

Future Work

The next version of the EDDI compiler (eddic) will be version 1.1.4. This version will add support for some basic pointer manipulation. It will also add support for dynamically allocated arrays. Finally, the version will include several new optimization techniques for loops: Loop Invariant Code Motion, Loop Strength Reduction and perhaps some basic Loop Unrolling.

Download

You can find the EDDI Compiler sources on the Github repository: https://github.com/wichtounet/eddic

The exact version I refer to is the v1.1.3 available in the GitHub tags or directly as the release branch.

Jelastic Java Host - Recommended by James Gosling!

I recently came across an interesting tool. Jelastic is a Platform as a Service (PaaS) provider for Java. Basically, it's a cloud for Java applications.

The most interesting point about Jelastic (in my opinion) is the fact that it can run any Java application. There is no API to use and no special change to make: you can take any Java app that you have and run it on Jelastic. Jelastic runs the Glassfish, Tomcat and Jetty application servers, and it's up to the developer to choose one. Because it's made only for Java, you have direct access to the application server you deploy to, but not to the machine itself.

Another great advantage of Jelastic is that it automatically scales vertically. At the beginning, your application is only allowed a very small amount of CPU and memory; when the system detects that it needs more, it automatically gives the application more resources. And when the application has more resources than it needs, they are released. The advantage is that you don't need to worry about the resources of your application and that the costs are kept to a minimum when the application doesn't need a lot of resources. Of course, you can also put limits on the scalability. An application can also be run on several different application servers (horizontal scaling), with automatic load balancing between the instances.

A Jelastic environment also provides access to a database server of your choice (MySQL, MariaDB, PostgreSQL, MongoDB, CouchDB). It has several other good features as well. You can look at the official list if you want a complete list of features.

The official Jelastic site provides several very good guides about how to deploy a specific type of application to Jelastic. For example, there are guides for Play! Framework, Clojure or Alfresco.

Another interesting point about Jelastic is that it has been recommended by James Gosling himself (the father of Java):

I really like Jelastic. It’s actually software package that a number of ISPs are using. It’s a Java hosting system and so you don’t get a bare Linux machine. What you get is a JavaEE container, and you can drop WAR files on them and they have this really nice control panel where you get a slider that says how many clones of Glassfish do you want and check boxes for [databases]. You don’t have to go into Linux – Oh my God, what it takes to install anything: it’s like which version of Linux is compatible with which app server and what time… they actually take care of that and it works lovely. I actually built these clusters and they can span multiple ISPs, multiple countries, multiple datacenters, and that’s how I deal with my personal extreme paranoia of the survivability of these things.

James is working at a small startup, Liquid Robotics, which operates a fleet of autonomous robots in the ocean.

I think that all this makes Jelastic a very good choice for a Java host!


EDDI Compiler 1.1.2 – Read command line

I finished the eddi compiler (eddic) version 1.1.2. It took me a long time because of a lot of other things I had to do this month.

This version includes two major changes. First, this version adds a new type for characters (char). A char can be declared using a char literal ('b' for instance) or from an int ((char) 77). This version also introduces the [] operator on strings to access a specific char.

Another major improvement is the support for reading the command line. For now, only characters can be read, one by one with the read_char function.

The standard library includes a new function to compare two strings (str_equals):

bool str_equals(string a, string b){
    if(length(a) != length(b)){
        return false;
    }

    for(int i = 0; i < length(a); ++i){
        if(a[i] != b[i]){
            return false;
        }
    }

    return true;
}

The other improvements are not related to the language. The inlining engine can now inline functions that take arrays as parameters. The symbol table is now represented by the global context, so there is no longer a global symbol table. This new version also includes several improvements of the code and a cleanup of the AST to remove redundancy.

Future Work

The next version of the eddi compiler (eddic) will be version 1.1.3. This version will introduce support for a very basic template engine. It will also add support for foreach on strings. This version will also add new features and cleanups in the different optimization passes.

Download

You can find the EDDI Compiler sources on the Github repository: https://github.com/wichtounet/eddic

The exact version I refer to is the v1.1.2 available in the GitHub tags or directly as the release branch.

Algorithms books Reviews

To be well prepared for an interview, I decided to read several Algorithms books. I also chose them so as to learn more about data structures. I chose these books to read:

  1. Data Structures & Algorithm Analysis in C++, Third Edition, by Clifford A. Shaffer
  2. Algorithms in a Nutshell, by George T. Heineman, Gary Pollice and Stanley Selkow
  3. Algorithms, Fourth Edition, by Robert Sedgewick and Kevin Wayne
  4. Introduction to Algorithms, by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein. I have to say that I have only read most of it, not completely, because some chapters were not interesting for me at the current time, but I will certainly read them later.

As some of my comments are about the presentation of the books, it has to be noted that I have read the first three books on my Kindle.

In this post, you will find my point of view about all these books.

Data Structures & Algorithm Analysis in C++

This book is really great. It contains a lot of data structures and algorithms. Each of them is very clearly presented. It is not hard to understand the data structures and the algorithms.

Each data structure is first presented as an ADT (Abstract Data Type) and then several possible implementations are presented. Each implementation is precisely defined and analyzed to find its sweet spots and worst cases. Other implementations are also mentioned, with enough references to know where to start with them.

I have found that some other books about algorithms spend too many words on a single topic. This is not the case with this book. Indeed, each interesting point is clearly and succinctly explained.

About the presentation, the code is well presented and the content of the book is very well written. A good thing would have been to add a summary of the most important facts about each algorithm and data structure. If you want to know these facts, you have to read several pages (but the facts are always there).

The book contains very good explanations about the complexity analysis of algorithms. It also contains a very interesting chapter about limits to computation, where it covers the P, NP, NP-Complete and NP-Hard complexity classes.

This book contains a large number of exercises and projects that can be used to improve your algorithmic skills even more. Moreover, there are very good references at the end of each chapter if you want more documentation about a specific subject.

I had some difficulty reading it on my Kindle. Indeed, it's impossible to switch chapters directly with the Kindle button. If you want quick access to the next chapter, you have to use the table of contents.

Algorithms in a Nutshell

This book is much shorter than the previous one. Even if it could be a good book for beginners, I didn't like this book a lot. The explanations are a bit messy sometimes and it could contain more data structures (even if I know that this is not the subject of the book). The analyses of the different algorithms are a bit short too. Even if that seems normal for such a short book, it should be noted that this book has no exercises.

However, this book also has several good points. Each algorithm is very well presented in a single panel. The complexity of each algorithm is given directly alongside its code. It helps to quickly find an algorithm and its main properties.

Another thing that I found good is that the authors included empirical benchmarks as well as complexity analysis. The chapters about Path Finding in AI and computational geometry were very interesting, especially because these topics are not widely covered in other books.

It also has very good references for each chapter.

This book was perfect to read with Kindle, the navigation was very easy.

Algorithms

This is a good book, but it suffers from several drawbacks compared to the other books. On the positive side, it covers a lot of data structures and algorithms, has very good explanations about complexity classes and offers a lot of exercises. I also really liked the chapter about string algorithms, which was lacking in the previous books.

Most of the time, the explanations are good, but sometimes, I found them quite hard to understand. Moreover, some parts of the code are also hard to follow. The authors included sample runs of some of the Java programs. In my opinion, this is quite useless: empirical benchmarks could have been useful, but not single runs of a program. Some of the diagrams were also hard to read, but that's perhaps a consequence of the Kindle.

One thing that disappointed me a bit is that the authors don't use Big Oh notation. Even if we have enough information to easily get the Big Oh equivalent, I don't understand why a book about algorithms doesn't use this notation.

Just like the first book, there is no simple view of a given algorithm that gathers all the information about it. Another thing that bothered me is that the authors take time to describe an API around the algorithms and data structures, as well as the Java API. Again, in my opinion only, this takes up too large a portion of the book.

Again, this book was perfect to read with Kindle, the navigation was very easy.

Introduction to Algorithms

This book is by far the most complete I have read about algorithms and data structures. It has very complete explanations about complexity analysis: Big Oh, Big Theta, small Oh. For each data structure and algorithm, the complexity analysis is very detailed and very well explained. The pieces of code are written in very good pseudocode.

As I said before, the complexity analyses are very complete and sometimes very complex. This can be either an advantage or a disadvantage, depending on what you expect from the book. For example, the analysis is made using several notations: Big Oh, Big Theta or even small Oh. Sometimes, it is a bit hard to follow, but it provides a very good basis for complexity analysis in general.

The book was also the one with the best explanations about linear time sorting algorithms. In the other books, I found it difficult to understand sorts like counting sort or bucket sort, but in this book, the explanations are very clear. It also includes multithreaded algorithm analysis, number theoretic algorithms, polynomials and a very complete chapter about linear programming.

The book contains a huge number of exercises for each chapter and subchapter.

This book will not only help you find the best suited algorithm for a given problem, it will also help you understand how to write your own algorithm for a problem or how to analyze deeply an existing solution.

Algorithms Book Wrap-up

As I read all these Algorithms books in order, it's possible that my review is a bit biased when it comes to comparisons with the other books.

If you plan to work in C++ and need more knowledge in algorithms and C++, I advise you to read Data Structures & Algorithm Analysis in C++, which is really awesome. If you want a very deep knowledge of algorithm analysis and algorithms in general and have a good mathematical basis, you should really take a deep look at Introduction to Algorithms. If you want a short introduction to algorithms and don't care about the implementation language, you can read Algorithms in a Nutshell. Algorithms is like a master key: it will give you good starting knowledge about algorithm analysis and a broad range of algorithms and data structures.

Architexa is available for free - Understand your code base

Architexa is a tool suite that helps a team collaboratively document a large Java code base. The tool is made for a whole team to understand a code base. It is available as an Eclipse plugin.

When several developers are working on a large application, it is not always simple to have an overall view of the application, even with some documentation of the code. It is even harder for a new developer who joins the project to know what the code base is about. In all these cases, Architexa will help your team. It can also be useful when you inherit an application.

Starting from today, Architexa is available for free for individuals and for teams of up to three developers. You can read the official announcement at the end of the article.

My Review of Architexa

I tried Architexa on several of my current Java projects, but never in a team. So perhaps my point of view is not very representative of the tool's typical users. I made my tests using Eclipse Juno.

However, even when working alone on a project, I think that this tool is very useful.

The installation is very straightforward: you just have to use the update site directly in Eclipse. Then, you have several new options in the IDE to use Architexa features.

Three diagrams are available in the Architexa tool suite:

  • Class Diagram: This diagram can be automatically generated for a package, or several packages.
  • Sequence Diagram: You can create Sequence Diagrams for some of your program actions.
  • Layered Diagram: This diagram allows you to represent the architecture of your application. The system allows you to represent several levels of details.

You can easily have several diagrams of each type in your project. You can store them as local files, on a server or on the community server to make them available for everyone.

You can add comments to each diagram. In each diagram you can also access the Javadoc of each class. Of course, you can also access any piece of code from your diagrams.

Advantages

  • Architexa is very simple to use. The tool gives access to very good guides directly inside the IDE.
  • The Real-Time Code analysis is awesome. Once something is in a diagram, it is always kept up to date.
  • The sharing features are also great.
  • Even though there are only a few diagram types, I think they are more than enough to get a very good understanding of a code base.
  • All the graphs look very nice and are very readable.

Drawbacks

  • No support for generics and enums.
  • The tool is only available as an Eclipse plugin. I mostly use IntelliJ IDEA and NetBeans.
  • The tool is only available for Java. There is a prototype for C/C++ that is available on demand, but I haven't tried it yet.
  • Sometimes, the creation of a very simple diagram feels a bit slow. Creating a diagram with three elements can take several seconds. Perhaps it is better with larger diagrams. I haven't had the occasion to test it with a large code base.

Conclusion

To conclude, Architexa is a great tool suite. It is useful for any Java developer who works on a large application. It gives them a better understanding of the code base.

The official announcement: Architexa Tool suite is Now Available for Free

More information on the official site: http://www.architexa.com/