Wednesday, July 29, 2015

.Net and C# Alternative to Matlab

Note: For this post I was paid by the ILNumerics GmbH, but I was allowed and supposed to choose the content freely and express my personal opinion.

In today's post I want to present the class library ILNumerics and share my experiences with it.
ILNumerics is a class library for .Net with the aim of allowing faster and easier implementations of numerical calculations, plots etc. Of course one could do this in theory with .Net means alone, but this could turn out to be very tedious (just think of the representation of vectors, multiplying matrices etc.). With ILNumerics now a language extension very similar to Matlab is available, with which complex algorithms can be implemented efficiently. In the following I  will now, as stated, give an introduction to this topic and test the product which is subject to a charge but which I rate in total as very good.

Let us start with the installation: The library can be downloaded via the homepage. Available is only a trial version, this fact and the price is the only negative point in my opinion. The program is relatively costly, starting from 89 Euros per month, so probably not suited for the hobby user. But for universities, companies etc., here I can think of ILNumerics as a real alternative to Matlab, especially since I personally like C# and .Net a lot. A free version for students would be highly appreciated.
After the installation the trial version can be tested respectively the version activated. More information you find on the support page. On this you also find a tutorial and a more precise documentation, which also served me as a reference.

The library contains basically two components: The computation engine and the visualization engine. First I will explain the usage of the computation engine, then that of the visualization engine.

To include the computation engine easily we create a new Windows-Forms project and then click on Project - Add New Item in the menu. In the new window we select "Computing Module" and with that add a new .cs file in the form of a computing module to the project. In this already some example methods are implemented which demonstrate the usage of ILNumerics, but which can of course be deleted. We do this but leave the Main method since we want to use the computing module as entry point for the program. We have to set this in the project since also the created Form class contains a main method. We do this by selecting in the project properties the computing module as Application - Startup Object.
The code of the file Computing Module1.cs should look as follows:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using ILNumerics;
using System.IO;

namespace IlNumericsTest
{

    public class Computing_Module1 : ILMath
    {
        public static void Main(params string[] parameter)
        {
        }
    }

}

Let us start with some easy operations for acclimatization.
Matrices are managed by the ILNumerics class ILArray. With this via +, -, *, / the expected calculation operations can be executed, which are matrix addition / subtraction and element wise multiplication / division. The following code creates 3 matrices of the size 10 x 10, the first is filled with ones, the second with zeros and the third with random values between 0 and 1. Then A + C and A * (-1) is outputted in the console.

ILArray<double> A = ones(10, 10);
ILArray<double> B = zeros(10, 10);
ILArray<double> C = rand(10, 10);
Console.WriteLine(A + C);
Console.WriteLine(-A);

Now some more complex examples, I want to show some operators on graphs and networks. Who does not know graphs is referred to Wikipedia, they are an often used data structure in computer science and mathematics. The research area Web Science especially deals with the investigation and analysis of networks (like the internet), for this graphs are used to represent the data, which can
take on huge sizes. Because of the thematic relevance (for example often Matlab is used) I decided to show some examples from this area as for examle the PageRank algorithm of Google.
But first a function for reading in data:

            var file = File.ReadAllText("Graph.txt");
            ILArray AdjMatrix = csvread(file);

In the first line the file is read to a string, which is then read to a matrix via the ILNumerics function csvread(). csvread() splits the read values (for example) by commas. The input file should contain the comma separatued adjacency matrix of a graph, which contains a 1 on position i, j if the nodes i and j are connected by an edge, otherwise 0. For a graph with 4 nodes the file could look as follows:

0, 1, 0, 0
1, 0, 1, 1
0, 1, 0, 1
0, 1, 1, 0

Now we want to determine the Laplacian Matrix out of the adjacency matrix, which contains the degree of node i in the i-th diagonal entry and -1 on position i, j if the nodes i, j, are connected. For this we create a new function. Since ILNumerics puts special emphasis on performance / memory management (or has to put if it wants to handle big data), we here have to watch some things. On the one hand there is not only the presented type ILArray, but also ILInArray, ILOutArray and ILRetArray (and in general also other types than Array). For more information I point to the documentation, here, just briefly, ILInArray is for example used for input parameters, is immutable and is immediately deleted after leaving the corresponding scope (because of memory / performance reasons). Because of this we have to enclose every function in a new scope, which contains the input parameters.
The function for calculating the Laplacian looks as follows:

public static ILRetArray<double> CalcLaplacian(ILInArray<double> AdjMatrix)
{
    using (ILScope.Enter(AdjMatrix))
    {
        ILArray<double> Degrees = sum(AdjMatrix);
        ILArray<double> LaplacianMatrix = Degrees * eye(AdjMatrix.Length, AdjMatrix.Length);
        LaplacianMatrix = LaplacianMatrix - AdjMatrix;
        return LaplacianMatrix;
    }
}

Input parameter is the adjacency matrix. In the first line in the scope the degree of the nodes is calculated, which is done via the function sum(). This sums over all columns of the matrix and then outputs a vector with the result, which equals exactly the sequence of degrees. eye() creates a matrix with ones in the diagonal, multiplying the degree vector and substracting it from the adjacency matrix gives us the desired matrix.

As a next example we calculate the density of a network. This is an important measure in networks, it is calcualted by dividing the number of present edges by the number of possible edges and thus points out how dense and centralized a network is. With a double summation over the matrix dimensions the density can simply be calculated:

public static ILRetArray<double> CalcDensity(ILInArray<double> AdjMatrix)
{
    using (ILScope.Enter(AdjMatrix))
    {
        ILArray<double> NrLinks = sum(sum(AdjMatrix));
        return (double)NrLinks / (AdjMatrix.Length * (AdjMatrix.Length - 1));
    }
}

Now to an algorithm, which made its inventors rich and world famous, but is in principal so easy that we implement it now here (in principal - of course there is much more to it than just these lines of code). I am talking about Google's Pagerank algorithm, which was for a long time the basis of the ranking of websites in the Google search. The algorithm assigns to every page a pagerank (a populariy), which is made up of the pageranks of the pages linking to this page. The formula for calculating it is explained in the linked Wikipedia article, in practise though the pagerank is approximated iteratively: Start with an arbitrary pagerank vector P, then the new pagerank is calculated by P = (1 - d) * e * A'T*P. Here d is a damping factor (e.g. 0.5), e is the unity vector and A' the matrix which results of the adjacency matrix A by substituting all rows with only zeros by rows with the entries 1/n (n = number of nodes). We repeat this until old and new pagerank are sufficiently close, that means the procedure does not change much any more with the next iterations. In ILNumerics this looks as follows:

        public static ILRetArray<double> CalcPageRank(ILArray<double> AdjMatrix)
        {
            using (ILScope.Enter(AdjMatrix))
            {
                ILArray<double> Degrees = sum(AdjMatrix.T);
                double epsilon = 0.00001;
                double d = 0.5;
                for (int i = 0; i < AdjMatrix.Length; i++)
                {
                    for (int j = 0; j < AdjMatrix.Length; j++)
                    {
                        if (Degrees[i] != 0)
                            AdjMatrix[i, j] /= Degrees[i];
                    }
                    if (AdjMatrix["0", ":"].Equals(zeros(1, AdjMatrix.Length)))
                        AdjMatrix["0", ":"] = ones(AdjMatrix.Length) / AdjMatrix.Length;
                }
                ILArray<double> POld = zeros(AdjMatrix.Length);
                ILArray<double> PNew = ones(AdjMatrix.Length);

                do
                {
                    POld = PNew;
                    PNew = (1 - d) * ones(AdjMatrix.Length) + d * multiply(AdjMatrix.T, POld);
                }
                while (norm(POld - PNew, 1) > epsilon);

                return PNew;
            }
        }

So with the previously described method first the adjacency matrix of an arbitrary graph has to be read, the function then calculates for every node in the graph the Pagerank and returns the vector of Pageranks. As one can see, also the selection of submatrices is possible like in Matlab. Matrix["a:b", "c:d"] chooses the submatrix which consists of the rows a - b and the columns c - d of Matrix.

But, like this the Pagerank should not be implemented in ILNumerics. I simply wrote the code in this style first to give an intuitive introduction, and then show how to better use the possibilites of ILNumerics.
In general the usage of custom ILNumerics functions is recommended over the usage of self made ones, like looping with for - loops over big matrices etc. Thus we use the type ILLogical instead of looping over the matrix and testing the rows for 0. This gives us a vector which describes which elements from some given vector fulfill a certain condition. We choose as a condition Degrees = 0, then select the submatrix with the corresponding rows and replace them by 1/n. We do the same to divide entries of nodes with degree unequal to zero by n. Further we pull unneccessary calculation operations out of the loop, like generating a vector of ones and transposing the matrix. The resulting code is much more simpler and quicker and looks as follows:

        public static ILRetArray<double> CalcPageRank(ILArray<double> AdjMatrix)
        {
            using (ILScope.Enter(AdjMatrix))
            {
                ILArray<double> Degrees = sum(AdjMatrix, 1);
                double epsilon = 0.00001;
                double d = 0.5;

                ILLogical dummy = Degrees == 0;

                AdjMatrix[dummy, full] = 1.0 / AdjMatrix.Length;
                dummy = Degrees != 0;

                AdjMatrix[dummy, full] = AdjMatrix[dummy, full] / Degrees[dummy];
                AdjMatrix = AdjMatrix.T;

                ILArray<double> POld = zeros(AdjMatrix.Length);
                ILArray<double> PNew = ones(AdjMatrix.Length);
                ILArray<double> ILOnes = (1.0 - d) * ones(AdjMatrix.Length);

                do
                {
                    POld = PNew;
                    PNew = ILOnes + d * multiply(AdjMatrix, POld);
                }
                while (norm(POld - PNew, 1) > epsilon);

                return PNew;
            }
        }

This algorithm implemented in Matlab takes about 5s for a graph of size 20 MB (about 3000 nodes), in ILNumerics as well. Such short running times of course do not give a reliable performance measure, but I think the two tools prove to be comparable. According to the producers ILNumerics has a somewhat bigger overhead, but is supposed to be faster for longer running times.

To conclude the topic about the computation engine here an example of how to solve a linear equation system:

            ILArray<double> A = ILMath.zeros(3, 3);
            A["0;:"] = new double[] { 1, 2, 1 };
            A["1;:"] = new double[] { 3, 4, 0 };
            A["2;:"] = new double[] { -1, 0, 1 };

            ILArray<double> B = new double[] { 5, 4, 7};

            ILArray<double> x = ILMath.linsolve(A, B);

Let us come to the visualization engine. For this there are of course many many possibilities and settings to create the desired plots, therefore I will here just give a brief introduction and point out to the online documentation for the rest.
As also for the computation engine we here follow the quick start guide and create a new Windows-Form project. Then we add via Project - Add New Item a new Plotting form. In the file Program.cs we change the line Application.Run(new Form1()); to Application.Run(new Plotting_Form1()); to tell the program to start with the plotting form.
This already contains example plots, but we first delete the code to create a new plot from scratch. The plotting form provides a control named ilPanel1, in its Load() function we create a simple line plot with 6 points when starting the form:

        private void ilPanel1_Load(object sender, EventArgs e)
        {
            var scene = new ILScene();
            // create some data

            ILArray<float> A = new float[] { 1, 2, 3, 4, -1, -2 };
            // add a plot cube
            scene.Add(new ILPlotCube {
  new ILLinePlot(A)
});
            ilPanel1.Scene = scene;
        }

Basic components are so called scenes. We here create a new one and add to this then a new lineplot with the given data. The result looks as follows:


By default the plotting area can be moved, rescaled etc. For this it has to be redrawn.









This shall already be enough, here some interesting plots of the homepage which also can be implemented quickly:



As a conclusion I can once again say that I see as a negative point the price, but that I otherwise liked the product a lot. It provides a fast and easy to use calculation and visualization toolkit and I am also convinced of the performance of the code. In ILNumerics see a real alternative to Matlab and would be happy to see it being used more often.

3 comments:

  1. Thanks for the great post! It would be interesting to see the Matlab code? There are many ways to do 'unfair' comparisons but basically only one way for a fair one.

    ReplyDelete
  2. Sure, good point!
    I will publish the Matlab Code, but I felt this post would become too full, so I decided to start an own posting series about Matlab, there I will include it then.

    ReplyDelete
  3. Great! Looking forward to read the new series.

    ReplyDelete