boost::any

In the last post we briefly saw the concept of type erasure. It is one of the more cryptic concepts to understand, but it is very useful. Let’s drill down into what it means to erase types. Consider:

#include <vector>
#include <string>

int main()
{
    std::vector<int> vec;
    vec.push_back(1);
    vec.push_back(std::string("a"));  // error: std::string does not convert to int

    return 0;    
}

The problem here is that you cannot add a string to a vector of int. Why? Type safety, obviously, and that’s great. But what if we want a heterogeneous container? The obvious solution that comes to mind is to create classes representing int and string, derive them from a common base class, and create a vector of pointers to that base class.

This is not a good solution for two reasons:

  • we are defining a class hierarchy just to create a heterogeneous container
  • as more types are added, the inheritance hierarchy becomes nonsensical and very difficult to maintain and extend

How do we get around this problem? There is a construct in the Boost library called any. It’s a really cool feature contributed by Kevlin Henney around 2001.

boost::any

Let’s see the polymorphic hierarchy code:

#include <vector>
#include <string>
#include <iostream>

class BaseClass
{
public:
    virtual ~BaseClass()
    {}
};

class IntClass : public BaseClass
{
    int i;
public:
    IntClass(int i) : i(i)
    {}
    ~IntClass()
    {}
};

class StringClass : public BaseClass
{
    std::string str;
public:
    StringClass(const std::string& s) :str(s)
    {}
    ~StringClass()
    {}
};

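int main()
{
  //  pointers to objects for polymorphism
  //
  std::vector<BaseClass*> vec;
  vec.push_back( new IntClass(0));
  vec.push_back( new StringClass("Hello"));

  for(auto iter = vec.begin(), end = vec.end(); iter != end; ++iter )
  {
      //  dynamic_cast yields a null pointer when the object
      //  is not of the requested type
      //
      if( dynamic_cast<IntClass*>(*iter) != nullptr )
      {
          std::cout << "An integer" << std::endl;
      }
      else if( dynamic_cast<StringClass*>(*iter) != nullptr )
      {
          std::cout << "A String" << std::endl;
      }
  }

  //  the vector holds raw owning pointers, so release them
  //
  for(auto iter = vec.begin(), end = vec.end(); iter != end; ++iter )
  {
      delete *iter;
  }

  return 0;
}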

The same code with boost::any:

#include <boost/any.hpp>
#include <vector>
#include <string>
#include <iostream>
#include <typeinfo>   // for typeid

int main()
{
    std::vector<::boost::any> vec;
    vec.push_back(1);
    vec.push_back(std::string("a"));

    for(auto iter = vec.begin(), end = vec.end(); iter != end; ++iter )
    {
        if( (*iter).type() == typeid(int) )
        {
            std::cout << "An integer" << std::endl;
        }
        else if( iter->type() == typeid(std::string) )
        {
            std::cout << "A String" << std::endl;
        }
    }

    return 0;
}

The above code compiles successfully. But here, instead of a polymorphic hierarchy, we are using an entirely different class to hold the objects being inserted. What does it buy us compared to the earlier solution? Is it type-safe? The first thing it buys us is less code to write; the second is readability, since it is more expressive about what kind of object can be pushed into the vector: any means any, whereas BaseClass* means only something derived from BaseClass. And it is a perfectly type-safe solution, as you can query the type of the contained object with the type function and retrieve the object with the any_cast function.
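To make the type-safety claim a little more concrete, here is a minimal sketch of how a class like this can remember the real type behind a type-less interface. SimpleAny, HolderBase, Holder and get are hypothetical names invented for this illustration; this is not Boost’s actual implementation, only the general shape of the idea.

```cpp
#include <typeinfo>
#include <utility>

// SimpleAny is a hypothetical, stripped-down stand-in for boost::any,
// written only to illustrate the idea; it is NOT Boost's implementation.
class SimpleAny
{
    // Abstract holder: the only type SimpleAny itself knows about.
    struct HolderBase
    {
        virtual ~HolderBase() {}
        virtual const std::type_info& type() const = 0;
        virtual HolderBase* clone() const = 0;
    };

    // Concrete holder: remembers the real type T behind the interface.
    template <typename T>
    struct Holder : HolderBase
    {
        explicit Holder(const T& v) : value(v) {}
        const std::type_info& type() const override { return typeid(T); }
        HolderBase* clone() const override { return new Holder(value); }
        T value;
    };

    HolderBase* content;

public:
    SimpleAny() : content(nullptr) {}

    // Any copyable value can be stored; its type is "erased" into Holder<T>.
    template <typename T>
    SimpleAny(const T& v) : content(new Holder<T>(v)) {}

    SimpleAny(const SimpleAny& other)
        : content(other.content ? other.content->clone() : nullptr) {}

    // Copy-and-swap assignment; also handles assignment from a plain value.
    SimpleAny& operator=(SimpleAny other)
    {
        std::swap(content, other.content);
        return *this;
    }

    ~SimpleAny() { delete content; }

    bool empty() const { return content == nullptr; }

    const std::type_info& type() const
    {
        return content ? content->type() : typeid(void);
    }

    // Returns a pointer to the stored value, or nullptr when the
    // requested type is not exactly the stored type.
    template <typename T>
    const T* get() const
    {
        if (type() != typeid(T)) return nullptr;
        return &static_cast<const Holder<T>*>(content)->value;
    }
};
```

The static_cast in get is only reached after the typeid comparison has succeeded, which is what keeps the access type-safe: asking for the wrong type yields a null pointer in this sketch rather than a reinterpretation of the stored bytes.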

boost::any 101

Some code demonstrating boost::any usage:

#include <boost/any.hpp>
#include <vector>
#include <string>
#include <iostream>
#include <typeinfo>   // for typeid

int main()
{
    std::vector<::boost::any> vec;
    vec.push_back(1);
    vec.push_back(std::string("a"));
    
    
    //  retrieve the value stored in any using any_cast
    //  we know the first object is an int
    //
    int val = ::boost::any_cast<int>(vec[0]);

    //  strict type checking
    //  this statement will throw bad_any_cast exception
    //
    try
    {
        val = ::boost::any_cast<double>(vec[0]);
    }
    catch(::boost::bad_any_cast& exBadCast)
    {
        std::cout << "Invalid conversion " << exBadCast.what() << std::endl;
    }

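    //  if you want a double, cast to the stored type (int) first
    //  and let the implicit int-to-double conversion do the rest
    //
    double dval = ::boost::any_cast<int>(vec[0]);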


    //  now first element is empty
    //
    vec[0] = ::boost::any();
    if( vec[0].empty() )
    {
        std::cout << "empty value" << std::endl;
    }

    //  now the first element is double!
    //  hmmm, types being changed dynamically, type-erasure at work!
    //
    vec[0] = 5.4;
    if( vec[0].type() == typeid(double) )
    {
        std::cout << "First element is of type double!" << std::endl;
    }
    
    return 0;
}

Let’s try it with polymorphic classes. Why? To see whether a pointer stored as DerivedClass* can be retrieved as BaseClass*.

#include <boost/any.hpp>
#include <iostream>

class BaseClass
{
public:
    virtual ~BaseClass()
    {}
};

class DerivedClass : public BaseClass
{
public:
    virtual ~DerivedClass()
    {}
};

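int main()
{
    DerivedClass derived;

    //  the any stores a DerivedClass*, exactly as given
    //
    ::boost::any ptr( &derived );
    try
    {
        //  throws: the stored type is DerivedClass*, not BaseClass*
        //
        BaseClass* ptr2 = ::boost::any_cast<BaseClass*>(ptr);
        (void)ptr2;
    }
    catch(::boost::bad_any_cast& exBadCast)
    {
        std::cout << "Polymorphism not supported!, typeid in use" << std::endl;
    }

    return 0;
}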

One of the downsides of relying on typeid is that it compares exact types rather than types in a hierarchy: any_cast does not understand polymorphism, so a pointer stored as DerivedClass* cannot be retrieved as BaseClass*.
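The difference between the two kinds of check can be shown without Boost at all. The sketch below reuses the class names from the example above; pointerTypesMatch and isDerived are hypothetical helper names chosen for this illustration.

```cpp
#include <typeinfo>

// Classes mirroring the example above.
class BaseClass
{
public:
    virtual ~BaseClass() {}
};

class DerivedClass : public BaseClass
{
public:
    virtual ~DerivedClass() {}
};

// Exact-type comparison: the two pointer types are simply different
// types, so this is false regardless of the inheritance relationship.
inline bool pointerTypesMatch()
{
    return typeid(DerivedClass*) == typeid(BaseClass*);
}

// Hierarchy-aware check: dynamic_cast consults the inheritance graph
// of the object actually pointed to.
inline bool isDerived(BaseClass* p)
{
    return dynamic_cast<DerivedClass*>(p) != nullptr;
}
```

One way around the limitation, assuming you control what goes in, is to convert the pointer to BaseClass* before storing it in the any. An any_cast<BaseClass*> then matches the stored type exactly, and a dynamic_cast on the result can recover the derived type.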

Performance

So far so good, but what about performance? The numbers below, in seconds, were generated by executing the for loops from the boost::any (first) section of this post.

Debug Build:

Loops         boost::any (s)   Polymorphic (s)
1 Million     3.308            3.058
10 Million    32.760           30.747

Release Build:

Loops         boost::any (s)   Polymorphic (s)
1 Million     0.109            0.125
10 Million    1.123            1.171
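The post does not say how these timings were taken, so the following is only one plausible way to collect numbers of this kind. timeLoop is a hypothetical helper, and the use of std::chrono::steady_clock is an assumption, not necessarily what produced the tables above.

```cpp
#include <chrono>
#include <cstddef>

// timeLoop is a hypothetical helper: it calls `body` the given number
// of times and returns the elapsed wall-clock time in seconds.
template <typename Body>
double timeLoop(std::size_t repetitions, Body body)
{
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < repetitions; ++i)
    {
        body();
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

Passing a lambda that runs the type-checking loop over the vector, once for the boost::any version and once for the polymorphic version, would give two directly comparable figures. Keep in mind that an optimizing compiler may remove work whose result is never used, so the loop body should have an observable effect.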

Performance is almost equivalent: in debug builds the polymorphic hierarchy has a slight upper hand, but in release builds it’s boost::any that performs marginally better. Both use the same underlying RTTI mechanism.

However, do note that both are important in their own right. In this case (a container of heterogeneous types), where we wanted type erasure, boost::any is a much better option than a polymorphic hierarchy. This aspect of programming, where you relax type checking just enough to achieve what you want but still keep type safety, is what qualifies as “type erasure”. In one of the future posts we will see how to implement such features and other cool stuff we can do with them. BTW, function objects in TR1 (std::tr1::function) are implemented using this concept.
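As a small illustration of that last remark, the sketch below stores three unrelated callable types in one vector. It uses std::function, the C++11 successor of the TR1 function wrapper, rather than std::tr1::function itself; twice, AddOffset and makeCallables are hypothetical names made up for the example.

```cpp
#include <functional>
#include <vector>

// A plain free function.
inline int twice(int x) { return 2 * x; }

// A hand-written function object with state.
struct AddOffset
{
    int offset;
    int operator()(int x) const { return x + offset; }
};

// Three unrelated callable types end up behind a single type that
// only knows the call signature int(int).
inline std::vector<std::function<int(int)>> makeCallables()
{
    std::vector<std::function<int(int)>> fns;
    fns.push_back(&twice);                       // function pointer
    fns.push_back(AddOffset{10});                // function object
    fns.push_back([](int x) { return x * x; });  // lambda
    return fns;
}
```

The vector’s element type mentions neither the function pointer type, nor AddOffset, nor the lambda’s closure type; only the signature survives, which is the same trade that boost::any makes for stored values.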