Install and Use CLang Static Analyzer on a CMake project


I recently started a bit of work on my compiler (eddic) again. I started by adapting it to build with Clang and libc++. There were some minor adaptations to make it compile, but nothing really fancy. It now compiles and runs fine on LLVM/Clang 3.4 with the latest version of libc++. I'm gonna use some features of C++14 in it and I plan to refactor some parts to make it more STL-correct. I also plan to use only Clang on eddic for now, since C++14 support in GCC has not been released yet.

I decided it was a good time to try the Clang static analyzer again.

Installation

If, like me, you're using Gentoo, the static analyzer is installed directly with the sys-devel/clang package, unless you disabled the static-analyzer USE flag.
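
If the flag was disabled, a minimal sketch to re-enable it could look like this (assuming /etc/portage/package.use is a plain file on your system; on some setups it is a directory):

echo "sys-devel/clang static-analyzer" >> /etc/portage/package.use
emerge --oneshot sys-devel/clang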

If your distribution does not ship the static analyzer directly with Clang, you'll have to install it manually. To install it from source, I advise you to follow the official installation instructions.

Usage

The usage of the Clang static analyzer can be a bit confusing at first. Most static analysis tools take the sources directly and do their job. That is not how the Clang Static Analyzer works: it acts as a kind of monitor on top of building the program, using scan-build. When you analyze a program, you are also building it.

For instance, if you are compiling a source file like this:

clang [clang-options] source_file.cpp

you can perform static analysis like this:

scan-build [scan-build-options] clang [clang-options] source_file.cpp

scan-build works by replacing calls to the compiler with calls to ccc-analyzer. This generally works well, but there are some cases where things get a bit more complicated. That is the case with CMake, where the paths to the compiler are hardcoded in the generated makefiles.

For that, you have to run cmake and make with scan-build:

export CCC_CC=clang
export CCC_CXX=clang++
scan-build cmake -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_C_COMPILER=clang .
scan-build make

This can take a very long time. On eddic, it is about three times slower than a normal compilation. An important point about performance is that you can run compilations in parallel (the -j option of make) and that this is supported quite well by scan-build.
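
For instance (the number of jobs is just an example):

scan-build make -j4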

Once the analysis is performed, the bugs that were found are put into an HTML report. By default, the HTML report is created in /tmp/, but you can specify the folder with the -o option of scan-build.

You can enable or disable checkers with the -enable-checker and -disable-checker options of scan-build.
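
For instance, a possible invocation combining these options might look like this (the checker names are only examples and the set of available checkers depends on your Clang version):

scan-build -o analyzer-reports -enable-checker alpha.core.CastToStruct -disable-checker deadcode.DeadStores make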

Results on eddic

Several versions of Clang ago, I tried the static analyzer on eddic, but it failed on several source files without producing any results. Moreover, I don't think there was any nice HTML report at that time.

I ran it again on eddic with the latest version. Here is a picture of the generated report:

CLang Static Analyzer eddic results

As you can see, 14 bugs have been found. Unfortunately, none of them is a real bug in my code, but they are not all false positives either. For instance, here is an unreachable code report:

CLang Static Analyzer eddic bug

It is indeed an unreachable statement, but that is expected, since it is an assert put there precisely to ensure that the code is unreachable. Still, it proves that the analysis works ;)

Even if it didn't find anything, it worked much better this time than the last time I checked, and the HTML report is just really good.

I hope you found this article interesting. If you happen to have interesting results on your codebase with the CLang static analyzer, I'd be glad to hear about them ;)


Related posts on a Nikola website


The one thing I missed in Nikola was Related Posts generation. I solved this during the migration from WordPress to Nikola by using a simple algorithm to generate related posts for each blog post and then displaying them in the form of a simple widget.

For example, you can see the related posts of this post on the left, just under my Google+ badge.

Here is the workflow that is used:

  • A simple C++ tool generates a list of related posts in HTML for each post
  • The generated HTML code is included in the Mako template using Python

In this article, I'll show how the related posts are generated and how to include them in your template.

Related Post Generation

It is important to note that it is necessary to clean up the content of the files before using it (see the sketch after this list):

  • First, all HTML that may be present in the Markdown files must be removed. I remove only the HTML tags, not their content. For instance, in <strong>test</strong>, test would be counted, but not strong. The only exception is that the content of preformatted parts (typically code or console output) is removed completely.
  • It is also necessary to clean up the Markdown itself: for instance, parentheses and square brackets are removed, but not their content. The same goes for the Markdown syntax for bold, italics, and so on.
  • Finally, I also remove punctuation.
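
As an illustration only, here is a rough shell sketch of this kind of cleanup (assuming Markdown sources with fenced code blocks; the file name is just an example, and my actual tool does this in C++):

# Hypothetical sketch: drop preformatted blocks, strip HTML tags, replace punctuation
sed -e '/^```/,/^```/d' post.md | sed -e 's/<[^>]*>//g' -e 's/[[:punct:]]/ /g'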

My related posts algorithm is very simple.

First, I compute the Term Frequency (TF) of each word in each post. The number of times a word is present in a document is represented by tf(w, d). I decided to give more importance to words in the title and the tags, but that is just a matter of choice.

After that, I compute the Inverse Document Frequency (IDF) of each word. This measure makes it possible to filter out words like a, the, and, has, is, ... which are not really representative of the content of a blog post. The formula for IDF is very simple: idf(w) = log(N / (1 + n(w))), where N is the total number of posts and n(w) is the number of posts in which the word is present. It is a measure of the rarity of a word across the complete set of posts.

Once we have these two values, we can easily compute the TF-IDF vector of each blog post. The TF-IDF for a word is simply: tf_idf(w, d) = tf(w, d) * idf(w).

Finally, we can derive the matrix of cosine similarities between the TF-IDF vectors. The idea of the algorithm is simple: each document is represented by a vector, and the distance between two vectors indicates how related two posts are. The formula for the cosine similarity is also simple: cs(d1, d2) = dot(d1, d2) / (||d1|| * ||d2||), where d1 and d2 are two TF-IDF vectors. Once the cosine similarity between each pair of documents is computed, we can just take the N most similar documents as the "Related Posts" of each blog post.
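
As a quick worked example with made-up numbers (using natural logarithms): a word that appears 3 times in a post and is present in 4 of the roughly 200 posts of this blog gets tf_idf = 3 * log(200 / (1 + 4)) = 3 * log(40) ≈ 11.1, while a stop word present in almost every post gets an idf close to log(200 / 201), which is about zero.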

With this list, the C++ program simply generates an HTML file that will be included in each post by the Nikola template. This process is very fast: I have around 200 posts on this blog and the generation takes about 1 second.

Include in template

Once the HTML files are generated, they are included into the website by altering the template and adding their content directly into the web page. Here is the code I use in base.tmpl.

%if post and not post.source_link().startswith('/stories/'):
    <div class="left-sidebar-widget">
        <h3>Related posts</h3>
        <div class="left-sidebar-widget-content">
            <%
                import os
                related_dir = os.getcwd()
                related_path = related_dir + post.source_link() + ".related.html"

                try:
                    with open(related_path, 'r') as f:
                        related_text = f.read()
                except IOError:
                    related_text = "Not generated"
            %>
            ${related_text}
        </div>
    </div>
%endif

You could also display it in post.tmpl as a simple list.

There is a limitation with this code: it only works if the source file has the same name as the slug; otherwise the file is not found. If someone has a solution to get the path to the source file instead of the slug version, I'd be glad to hear it ;)

Conclusion

The code for the generator is available on the Github repository of my website.

I wrote it in C++ because I don't like Python a lot, I'm not good at it, and it would have taken me a lot more time to integrate it into Nikola. If I have the time and I'm motivated enough, I'll try to integrate that into Nikola.

I hope this can be useful to some people.


Migrated from Wordpress to Nikola


As you're reading this post, the site has been migrated from WordPress to Nikola and is now hosted on Github. Nikola is a static site generator.

Reasons of the migration

I had several reasons to migrate my website from WordPress to a static site generator:

  1. I was getting tired of WordPress. It is not a bad tool, but it is becoming heavier and heavier. I think that one of the biggest problems is that you need tons of plugins to make a fully functional blog; I had more than 20 plugins. And each time you upgrade WordPress, you run into problems with the plugins. In my opinion, while I understand why you need a plugin for syntax highlighting, for instance, you should not need any plugin for performance or security. Moreover, when you think about it, a blog is not dynamic: I write less than a post a week and most bloggers write about once a day. In the computer-science sense, it is not dynamic at all. So why bother with a database?
  2. I wanted to use my favourite tools for modifying my blog: the shell and vim. I don't think that WYSIWYG editors really add any value to editing. I am faster writing posts in vim than in a web editor.
  3. I wanted to be able to edit my website offline. With a static generator, as long as you have the files on your computer, you can edit your site and even browse it offline. You can then deploy it on the internet later when you are online.
  4. I wanted to host my blog on Github Pages, for fun! Moreover, I had some uptime issues with my previous host, with quite some downtime in the last months. And it saves me some bucks each year, although that was not a strong factor.

The quest for a good static blog generator

It has been several months already since I started thinking about migrating my blog. I had quite a hard time finding a suitable blog generator.

I needed something:

  • Simple
  • Completely usable in command line
  • With a WordPress import feature
  • Fast: I didn't want to spend a long time generating the website.
  • Actively developed

The problem is that there are tens of static site generators. I considered several of them. The most well known is Jekyll. It really looks fine, but I have to say that I HATE Ruby. I think it is a horrible language with an even more horrible environment; I cannot even have Ruby installed on my computer. So I didn't spend a long time considering Jekyll. I also considered Hyde, which is the evil brother of Jekyll, but I think it was missing the documentation to be completely usable for me. I also thought of Pelican, but I was not convinced by it.

I don't know how, but at first I didn't find out about Nikola. It was only after some time that I came across it by pure luck. Once I found Nikola, I was directly convinced by it. Nikola is written in Python and has a large set of features but still keeps the whole thing very simple. Generation of the website is pretty fast. Even though I don't like Python very much, I'm able to stand its environment and, if necessary, I can hack around a bit. So I decided to try the complete migration.

The migration

Once I decided to migrate to Nikola, I started by importing my WordPress site into a Git repository. This process is quite simple: you just have to export an XML dump from WordPress and then import it into Nikola with the import_wordpress command. This already downloads the necessary images and resources and creates posts and pages corresponding to your site. It also generates some redirections from the old URL scheme to the new one.
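
For reference, the import itself is a single command along these lines (the file name of the dump is just an example):

nikola import_wordpress my_wordpress_export.xml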

However, there is still some manual work to be done. Here is what I had to do after I imported my WordPress site into Nikola:

  • As syntax highlighting was done by a plugin, I had to convert the code blocks to Markdown myself. This was quite easy, just a matter of sed (see the sketch after this list).
  • I was not satisfied with the default templates, so I enhanced them myself. As I'm a very poor web developer and an even poorer web designer, it took me a long time, even though the result is a simple one.
  • I wanted to add some visibility to the comments, so I used the Disqus API to create Most Popular and Recent Comments widgets.
  • I had to create some redirections myself for the tags and categories. This was again just a matter of simple shell commands. I filed a bug about it, so it'll probably be fixed in the near future.
  • I tried to improve the performance of the generated website, but I'm still going to work on this later; the calls to the Disqus and Google JavaScript are what takes most of the load time. I think that a static site could be even faster.
  • Finally, I really missed the option to have related posts generated for each post, so I hacked a simple way to include them. The related posts are generated using a very simple algorithm. I'll soon write a post about how I did this.
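
As a purely hypothetical sketch of the syntax highlighting conversion mentioned in the first point, assuming the plugin used shortcodes like [cpp]...[/cpp] and that fenced Markdown blocks are the target:

# Hypothetical: convert [cpp]...[/cpp] shortcodes into fenced Markdown code blocks
sed -i -e 's/\[cpp\]/```cpp/g' -e 's/\[\/cpp\]/```/g' posts/*.md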

Apart from these things, it hasn't been too hard to migrate to Nikola.

Conclusion

Until now I'm really satisfied with Nikola and I hope this will motivate me to write more blog posts in the coming months. I hope you'll find the website as enjoyable as before (or even more :) ).

If you are interested, you can read the source of this blog post.

Even though I tried my best to avoid 404s or problems with the new site, I'm pretty sure there will be some issues in the following weeks. If you happen to find a dead link or some part of the website that is not working for you, don't hesitate to comment on this post and I'll do my best to fix it. If you have suggestions on how to improve the site or have a question about the process of migrating a website from WordPress to Nikola, I'd be glad to answer you.


budgetwarrior 0.3.0 - Objective and wish management


I'm pleased to announce another release of budgetwarrior, version 0.3.0.

Changes

This version contains several important changes.

The first one is the addition of a new module to manage objectives. You can add an objective with budget objective add. For instance, you can add an objective saying you want to save 10000$ a year or 200$ a month. Once you have set your objectives, budgetwarrior computes how well you fulfill them. For instance, here is the status of my objectives:

Objective Status

Another module has been added to manage wishes. You can add wishes to budgetwarrior (budget wish add) and then budgetwarrior will tell you if it is a good time to buy them. Here is an example of wish status:

Wish Status

The diagnostic tells you where the money will be taken from: from your savings, from the year's savings, or from the month's savings (the ideal case). It also checks the objectives to see whether the purchase would break the fulfillment of any of them.

For a complete diagnostic, it is necessary to register your fortune (budget fortune check), ideally once a month.
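
For reference, here are the three commands mentioned in this post, grouped in one place:

budget objective add    # define a yearly or monthly objective
budget wish add         # register something you would like to buy
budget fortune check    # record your current fortune, ideally once a month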

Of course, this is only a tool; you should not rely on it alone to decide when to buy something, but it can give you a useful point of view ;)

Moreover, this version also has other smaller changes:

  1. When you make an error when creating a new item (expense, earning, ...), the tool now lets you retry without losing what you typed before.
  2. Confirmation messages are now shown after each modification command (delete, add and edit).
  3. The license has been changed from Boost to MIT. The meaning is almost the same, but the MIT license is better known, and I thought it would make it easier for people to understand what it implies.
  4. There have been several changes to the code base, but they don't impact the usage of the tool.

Conclusion

I hope you'll find these changes interesting :)

If you are interested in the tool, you can download it on Github: budgetwarrior

  • There are now Gentoo and Arch Linux installation packages available for ease of installation

If you have a suggestion or you found a bug, please post an issue on the github project: https://github.com/wichtounet/budgetwarrior.

If you have any comment, don't hesitate to contact me, either by leaving a comment on this post or by email.


budgetwarrior 0.2.1 - Minor changes and Gentoo ebuild


I've released a new version of budgetwarrior, version 0.2.1. budgetwarrior is a simple command line application to manage a personal budget.

Version 0.2.1 contains several bug fixes for archived accounts and for budgets spanning several years.

The application as well as the source code is available online: https://github.com/wichtounet/budgetwarrior

I've created Gentoo ebuilds for this application. They are available on my Portage overlay: https://github.com/wichtounet/portage-overlay

Gentoo Installation

  • Edit the overlays section of /etc/layman/layman.cfg. Here's an example:

overlays: http://www.gentoo.org/proj/en/overlays/repositories.xml http://github.com/wichtounet/portage-overlay/raw/master/repository.xml

  • Sync layman:
layman -S
  • Add the overlay:
layman -a wichtounet
  • Install budgetwarrior:
emerge budgetwarrior

Conclusion

If you find any issues with the tool, don't hesitate to post an issue on Github. If you have comments about it, you can post a comment on this post or contact me by email.


Home Server Adventure – Step 3


Here is some news about my home server installation project. In the past, I already installed a server in a custom Norco case. I wanted to replace my QNAP NAS with a better server, the QNAP being too slow and not extensible enough for my needs.

Here is how it looks right now (sorry about the photo quality :( my phone does not seem to focus anymore...):

My Home Rack of Server

So I replaced my QNAP NAS with a custom-built NAS. Again, I bought a Norco case, the RPC-4220. This case has 20 SATA/SAS bays. I bought it with a replacement of the SAS backplane by a SATA one. I also ordered some fan replacements to make it less noisy. I installed my 6 hard disks in RAID 5, managed with mdadm, with LVM partitions on top of the array.
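
For the curious, the kind of commands involved looks roughly like this (device names, sizes and volume names are only examples, not my exact setup):

mdadm --create /dev/md0 --level=5 --raid-devices=6 /dev/sd[b-g]
pvcreate /dev/md0
vgcreate storage /dev/md0
lvcreate -L 2T -n data storage
mkfs.ext4 /dev/storage/data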

I also added an APC UPS, which gets me through all the minor power issues in my old apartment and also gives me about 10 minutes of backup power when there is an outage.

I haven't added a lot of services on the server. I now run Owncloud on the server and that completely replaces my Dropbox account. I also improved my Sabnzbd installation with other newsgroup automation tools.

Not directly related to my rack, but I also installed a custom XBMC server for my TV. It reads from the NAS server. And of course, it runs Gentoo too.

In the future, I'll add a new simple server as a front firewall, to manage security a bit more than now and avoid having to configure redirections in my shitty router (which I would like to replace, but unfortunately there are not a lot of rack routers compatible with my ISP). It will probably use a Norco case too.

If you have any question about my build, don't hesitate ;)


Zabbix - Low Level Discovery of cores, CPUs and Hard Disk


Zabbix SSD Status, configured with Low Level Discovery

At home, I'm using Zabbix to monitor my servers. It has plenty of interesting features and can be extended a lot by using user parameters.

In this post, I'm gonna talk about Low Level Discovery (LLD). If you are only interested in the final result, go to the Conclusion section, where you can download my template containing all the rules ;)

Low Level Discovery (LLD)

LLD is a feature to automatically discover some properties of the monitored host and create items, triggers and graphs.

By default, Zabbix supports three types of item discovery:

  • Mounted filesystems
  • Network interface
  • SNMP's OIDs

The first two are very useful, since they will give you by default, for instance, the free space of each mounted file system or the bandwidth going in and out of each network interface. As I only monitor Linux servers, I don't use the last one, but it may interest other people.

Another very interesting thing about this feature is that you can extend it by discovering more items. In this article, I will show how to discover CPUs, CPU cores and hard disks.

The most important part of custom discovery is to create a script on the monitored machines that can "discover" something. It can be any executable; the only important thing is that it outputs data in the correct format. I have to say that the format is quite ugly, but that is probably not very important ;) Here is the output of my hard disk discovery script:

{
"data":[
    {"{#DISKNAME}":"/dev/sda","{#SHORTDISKNAME}":"sda"},
    {"{#DISKNAME}":"/dev/sdb","{#SHORTDISKNAME}":"sdb"},
    {"{#DISKNAME}":"/dev/sdc","{#SHORTDISKNAME}":"sdc"},
    {"{#DISKNAME}":"/dev/sdd","{#SHORTDISKNAME}":"sdd"},
    {"{#DISKNAME}":"/dev/sde","{#SHORTDISKNAME}":"sde"},
    {"{#DISKNAME}":"/dev/sdf","{#SHORTDISKNAME}":"sdf"},
    {"{#DISKNAME}":"/dev/sdg","{#SHORTDISKNAME}":"sdg"},
]
}

You can have as many keys as you want for each discovered item, but the format must remain the same. In the item, trigger and graph prototypes, you will then use {#DISKNAME} or {#SHORTDISKNAME} to refer to the discovered values.
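
For instance, a hypothetical item prototype key, using the smartctl user parameter shown in the Wrap-Up section below, could look like this:

system.smartd_raw[{#DISKNAME},Temperature_Celsius]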

Once you have created your scripts, you have to register them in the Zabbix configuration as user parameters. For instance, if you use the Zabbix agent daemon, you need these lines in /etc/zabbix/zabbix_agentd.conf:

EnableRemoteCommands=1
...
UnsafeUserParameters=1
...
UserParameter=discovery.hard_disk,/scripts/discover_hdd.sh

Now, when you create the discovery rule, you can use discovery.hard_disk as the key.

A discovery rule in itself is not very useful without prototypes. You can create three types of prototypes:

  • Item Prototype: This will create a new item for each discovered entity
  • Trigger Prototype: This will create a new trigger for each discovered entity.
  • Graph Prototype: This will create a graph for each discovered entity.

The most useful are by far the item and trigger prototypes. The biggest problem with graphs is that you cannot create an aggregate graph of all the discovered items. For instance, if you record the temperature of your CPU cores, you cannot automatically create a single graph with the temperature of each discovered core. For that, you have to create the graph in each host, which makes graph prototypes pretty useless, imho. Anyway...

In the next sections, I'll show how I have created discovery rules for hard disks, CPUs and CPU cores.

Discover Hard Disk

The discovery script is really simple:

#!/bin/bash
disks=`ls -l /dev/sd* | awk '{print $NF}' | sed 's/[0-9]//g' | uniq`
echo "{"
echo "\"data\":["
for disk in $disks
do
    echo "    {\"{#DISKNAME}\":\"$disk\",\"{#SHORTDISKNAME}\":\"${disk:5}\"},"
done
echo "]"
echo "}"

It just lists all the /dev/sdX devices, removes the partition numbers and removes the duplicates, so that only the hard disks remain at the end.

I've created several item prototypes for each hard disk. Here are some examples using S.M.A.R.T. (you can download the template with all the items in the Conclusion section):

  • Raw Read Error Rate
  • Spin Up Time
  • SSD Life Left
  • Temperature
  • ...

You may notice that some of them only make sense for SSDs (SSD Life Left) and some others do not make any sense for SSDs (Spin Up Time). This is not a problem, since they will just be marked as Not Supported by Zabbix.

All these data are collected using the smartctl utility.

I've also created some triggers to indicate the coming failure of a hard disk:

  • SSD Life Left too low
  • Reallocated Sector Count too low
  • ...

I've just used the thresholds reported by smartctl; they may differ from one disk manufacturer to another. I don't put a lot of faith in these values, since disks generally fail before reaching the threshold, but it can be a good indicator anyway.

Discover CPUs

Here is the script to discover CPUs:

#!/bin/bash
cpus=`lscpu | grep "CPU(s):" | head -1 | awk '{print $NF}'`
cpus=$(($cpus-1))
echo "{"
echo "\"data\":["
for cpu in $(seq 0 $cpus)
do
    echo "    {\"{#CPUID}\":\"$cpu\"},"
done
echo "]"
echo "}"

It just uses lscpu and parses its output to find the number of CPUs and then creates an entry for each CPU.

I just have one item for each CPU: The CPU Utilization.

I haven't created any trigger here.

Discover CPU Cores

Just before, we discovered the CPUs, but it is also interesting to discover the cores. If you don't have Hyperthreading, the result will be the same. It is especially interesting to get the temperature of each core. Here is the script:

#!/bin/bash
cores=`lscpu | grep "Core(s) per socket:" | awk '{print $NF}'`
cores=$(($cores-1))
echo "{"
echo "\"data\":["
for core in $(seq 0 $cores)
do
    echo "    {\"{#COREID}\":\"$core\"},"
done
echo "]"
echo "}"

It works in the same way as the previous script.

I've only created one item prototype, to get the temperature of each core with lm_sensors.

Wrap-Up

Here are all the UserParameter entries necessary to make the discovery and the items work:

### System Temperature ###
UserParameter=system.temperature.core[*],sensors|grep Core\ $1 |cut -d "(" -f 1|cut -d "+" -f 2|cut -c 1-4
### DISK I/O###
UserParameter=custom.vfs.dev.read.ops[*],cat /proc/diskstats | egrep $1 | head -1 | awk '{print $$4}'
UserParameter=custom.vfs.dev.read.ms[*],cat /proc/diskstats | egrep $1 | head -1 | awk '{print $$7}'
UserParameter=custom.vfs.dev.write.ops[*],cat /proc/diskstats | egrep $1 | head -1 | awk '{print $$8}'
UserParameter=custom.vfs.dev.write.ms[*],cat /proc/diskstats | egrep $1 | head -1 | awk '{print $$11}'
UserParameter=custom.vfs.dev.io.active[*],cat /proc/diskstats | egrep $1 | head -1 | awk '{print $$12}'
UserParameter=custom.vfs.dev.io.ms[*],cat /proc/diskstats | egrep $1 | head -1 | awk '{print $$13}'
UserParameter=custom.vfs.dev.read.sectors[*],cat /proc/diskstats | egrep $1 | head -1 | awk '{print $$6}'
UserParameter=custom.vfs.dev.write.sectors[*],cat /proc/diskstats | egrep $1 | head -1 | awk '{print $$10}'
UserParameter=system.smartd_raw[*],sudo smartctl -A $1| egrep $2| tail -1| xargs| awk '{print $$10}'
UserParameter=system.smartd_value[*],sudo smartctl -A $1| egrep $2| tail -1| xargs| awk '{print $$4}'
### Discovery ###
UserParameter=discovery.hard_disk,/scripts/discover_hdd.sh
UserParameter=discovery.cpus,/scripts/discover_cpus.sh
UserParameter=discovery.cores,/scripts/discover_cores.sh

(these must be set in zabbix_agentd.conf)

You also need to give zabbix the right to use sudo with smartctl. For that, you have to edit your /etc/sudoers file and add this line:

ALL ALL=(ALL)NOPASSWD: /usr/sbin/smartctl

Conclusion and Download

I hope that this helps some people to use Low Level Discovery in their Zabbix Monitoring Installation.

LLD greatly eases the creation of items for hosts with different hardware or configurations. However, it has some problems for which I have not yet found a proper solution. First, you have to duplicate the client scripts on each host (or at least have them on a share available from each of them). Then, the configuration of each agent is also duplicated on each host. The biggest problem, I think, is the fact that you cannot automatically create a graph aggregating the generated items of all discovered entities. For instance, I had to create a CPU Temperature graph in each of my hosts. If you have only a few hosts, as I do, it is acceptable, but if you have hundreds of hosts, you just don't do it.

All the scripts and the template export file are available in the zabbix-lld repository. For everything to work, you need the lscpu, lm_sensors and smartmontools utilities.

If you have any question or if something doesn't work (I don't offer any guarantee, but it should work on most recent Linux machines), don't hesitate to comment on this post.


Thor OS: Boot Process


Some time ago, I started a hobby project: writing a new operating system. I'm not trying to create a competitor to Linux, I'm just trying to learn some more stuff about operating systems. I'm gonna try to write some posts about this kernel on this blog.

In this post, I'll describe the boot process I've written for this operating system.

Bootloader Step

The first step is of course the bootloader. The bootloader is in the MBR and is loaded by the system at 0x7C00.

I'm doing the bootloading in two stages. The first stage (one sector) prints some messages and then loads the second stage (one sector) from the floppy at 0x900. The goal of doing it in two stages is just to be able to overwrite the memory of the first stage. The second stage loads the kernel into memory from the floppy. The kernel is loaded at 0x1000 and then run directly.

The bootloader stages are written in assembly.

Real mode

When the processor starts, it boots in real mode (16 bits) and you have to set up plenty of things before you can go into long mode (64 bits). So the first steps of the kernel run in 16 bits. The kernel is mostly written in C++ with some inline assembly.

Here are all the things that are done in this mode:

  1. The memory is inspected using BIOS E820 function. It is necessary to do that at this point since BIOS function calls are not available after going to protected mode. This function gives a map of the available memory. The map is used later by the dynamic memory allocator.
  2. The interrupts are disabled and a fake Interrupt Descriptor Table is configured to make sure no interrupts are raised in protected mode.
  3. The Global Descriptor Table is setup. This table describes the different portion of the memory and what each process can do with each portion of the memory. I have three descriptors: a 32bit code segment, a data segment and a 64bit code segment.
  4. Protected mode is activated by setting PE bit of CR0 control register.
  5. Paging is disabled.
  6. Jump to the next step. It is necessary to use a far jump so that the code segment is changed.

Protected Mode

At this point, the processor is running in protected mode (32 bits). BIOS interrupts are not available anymore.

Again, several steps are necessary:

  1. To be able to use all memory, Physical Address Extensions are activated.
  2. Long Mode is enabled by setting the EFER.LME bit.
  3. Paging is set up; the first MiB of memory is mapped to the exact same virtual addresses.
  4. The address of the Page-Map Level 4 Table is set in the CR3 register.
  5. Finally paging is activated.
  6. Jump to the long mode kernel, again by using a far jump so that the code segment is changed.

Long Mode

The kernel finally runs in 64 bits.

There are still some initialization steps that need to be done:

  1. SSE extensions are enabled.
  2. The final Interrupt Descriptor Table is setup.
  3. ISRs are created for each possible processor exception.
  4. The IRQs are installed in the IDT.
  5. Interrupts are enabled.

At this point, the kernel is fully loaded and starts its initialization: loading drivers, preparing memory, setting up timers...

If you want more information about this process, you can read the different source files involved (stage1.asm, stage2.asm, boot_16.cpp, boot_32.cpp and kernel.cpp) and if you have any question, you can comment on this post.


New hobby project: Thor-OS, 64bit Operating System in C++


It's been a long time since I have posted about a project on this blog. A bit more than two months ago, I started a new project: thor-os.

This project is a simple 64bit operating system, written in C++. After having written a compiler, I decided it could be fun to try an operating system. And it is fun indeed :) It is a really exciting project and there are plenty of things to do in every direction.

I've also written the bootloader myself, but it is a very simple one. It just reads the kernel from the floppy, loads it into memory, jumps to it, and nothing else.

Features

Right now, the project is fairly modest. Here are the features of the kernel:

  • Serial Text Console
  • Keyboard driver
  • Timer driver (PIT)
  • Dynamic Memory Allocation
  • ATA driver
  • FAT32 driver (Work In progress)
  • Draft of an ACPI support (only for shutdown)

All the commands are accessible with a simple shell integrated directly in the kernel.

Testing

All the testing is done in Bochs and Qemu. I don't have another computer available to test on real hardware right now, but that is something I really want to do. For now, my bootloader only supports floppy disks, so it will need to be improved to load the kernel from a hard disk, since it is not likely that I will find a floppy disk to test with :D

Here is a screenshot of the OS in action:

Thor OS Screenshot

Future

The next thing that I will improve is the FAT32 driver, to have a complete implementation including creating files and writing to them.

After that, I still don't know whether I will try to implement a simple framebuffer or start implementing user space.

As for all my projects, you can find the complete source code on Github: https://github.com/wichtounet/thor-os

Don't hesitate to comment if you have any question or suggestion for this project ;) I will try to write some posts about it in the future; again, if you have ideas of subjects for these posts, don't hesitate. The first will probably be about the boot process.


Gentoo Tips: Avoid Gnome 3.8 from being emerged automatically


Since Gnome 3.8 arrived in the Portage tree, a lot of problems arise when you try to emerge something. If it only happened when updating the system, it would be OK, but it happens every time you try to install something.

For instance, if I try to update vim on my system, it tries to update empathy to version 3.8 and then pulls some other dependencies, causing blocks and other USE problems. I personally don't think empathy should be emerged when emerging vim. Fortunately, you can disable this behavior by using emerge in this way:

emerge --ignore-built-slot-operator-deps=y ...

With that, when you emerge vim, it doesn't emerge Gnome 3.8. It is very useful if you want to stay with Gnome 3.6 for the moment.
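
For instance, updating vim would then look something like this (the package atom is just an example):

emerge --ignore-built-slot-operator-deps=y --update app-editors/vim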

I have already used this tip several times. I hope it will be useful to other people.
