Channel: Bartek's coding blog

How a weak_ptr might prevent full memory cleanup of managed object


Weak pointer and shared pointer

When I was working on the Smart Pointer Reference Card I ran into quite an interesting issue. It appears that in some cases memory allocated for the object controlled by shared_ptr might not be released until all weak pointers are also ‘dead’.

Such a case was surprising to me because I thought that the moment the last shared_ptr goes down, the memory is released.

Let’s drill down into the case. It might be interesting as we’ll learn how shared/weak pointers interact with each other.

The case

Ok, so what’s the problem?

First, we need to see the elements of the case:

  • a managed object, let’s assume it’s big
  • shared_ptr (one or more) - they point to the above object (resource)
  • make_shared - used to create a shared pointer
  • weak_ptr
  • the control block of shared/weak pointers

The code is simple:

Shared pointers to our large object go out of the scope. The reference counter reaches 0, and the object can be destroyed. But there’s also one weak pointer that outlives shared pointers.

weak_ptr<MyLargeType> weakPtr;
{
    auto sharedPtr = make_shared<MyLargeType>();
    weakPtr = sharedPtr;
    // ...
}
cout << "scope end...\n";

In the above code we have two scopes: inner - where the shared pointer is used, and outer - with a weak pointer (notice that this weak pointer holds only a ‘weak’ reference, it doesn’t use lock() to create a strong reference).

When the shared pointer goes out of the scope of the inner block it should destroy the managed object… right?

This is important: when the last shared pointer is gone, this destroys the object in the sense of calling the destructor of MyLargeType… but what about the allocated memory? Can we also release it?

To answer that question let’s consider the second example:

weak_ptr<MyLargeType> weakPtr;
{
    shared_ptr<MyLargeType> sharedPtr(new MyLargeType());
    weakPtr = sharedPtr;
    // ...
}
cout << "scope end...\n";

Almost the same code… right? The difference is only in the approach to create the shared pointer: here we use explicit new.

Let’s see the output when we run both of those examples.

In order to have some useful messages, I needed to override global new and delete, plus report when the destructor of my example class is called.

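#include <cstdlib>
#include <iostream>
using namespace std;

// replacement for global new: report the requested size
void* operator new(size_t count) {
    cout << "allocating " << count << " bytes\n";
    return malloc(count);
}

// replacement for global delete: report every release
void operator delete(void* ptr) noexcept {
    cout << "global op delete called\n";
    free(ptr);
}

struct MyLargeType {
    ~MyLargeType() { cout << "destructor MyLargeType\n"; }

private:
    int arr[100]; // wow... so large!!!!!!
};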

Ok, ok… let’s now see the output:

For make_shared:

allocating 416 bytes
destructor MyLargeType
scope end...
global op delete called

and for the explicit new case:

allocating 400 bytes
allocating 24 bytes
destructor MyLargeType
global op delete called
scope end...
global op delete called

What happens here?

The first important observation is that, as you might already know, make_shared will perform just one memory allocation. With the explicit new we have two separate allocations.

So we need a space for two things: the object and... the control block.

The control block is implementation dependent, but it holds the pointer to the object and also the reference counter, plus some other things (like a custom deleter, allocator, …).

When we use explicit new, we have two separate blocks of memory. So when the last shared pointer is gone, then we can destroy the object and also release the memory.

So we see the output:

destructor MyLargeType
global op delete called

Both the destructor and free() are called - before the scope ends.

However, when a shared pointer is created using make_shared() then the managed object resides in the same memory block as the rest of the implementation details.

Here’s a picture with that idea:

Control block of shared pointers

The thing is that when you create a weak pointer, the "weak counter" inside the control block is usually increased. Weak pointers and shared pointers need that mechanism so that they can answer the question “is the pointer dead or not yet” (for example, when you call the expired() method).

In other words we cannot remove the control block if there’s a weak pointer around (while all shared pointers are dead). So if the managed object is allocated in the same memory chunk, we cannot release memory for it as well - we cannot free just part of the memory block (at least not that easily).
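To make that concrete, here is a minimal sketch of a weak pointer asking "is the object still alive?" after the only shared pointer is gone. The Widget type and the weak_observes_destruction() helper are made up for illustration; only expired() and lock() are standard library calls.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type, used only for this illustration.
struct Widget {
    int value = 42;
};

// Returns true if the weak pointer reports the object as dead
// once the only shared_ptr has gone out of scope.
bool weak_observes_destruction() {
    std::weak_ptr<Widget> weak;
    {
        auto shared = std::make_shared<Widget>();
        weak = shared;
        // While a shared_ptr is alive, lock() yields a usable strong reference.
        if (auto locked = weak.lock()) {
            assert(locked->value == 42);
        }
        assert(!weak.expired());
    }
    // The object is destroyed here, but the control block must still exist
    // so that expired() and lock() can give a correct answer.
    return weak.expired() && weak.lock() == nullptr;
}
```

The last line only works because the control block outlives the object, which is exactly why it cannot be freed while a weak pointer is around.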

Below you can find some code from MSVC implementation, this code is called from the destructor of shared_ptr (when it’s created from make_shared):

~shared_ptr() _NOEXCEPT
{   // release resource
    this->_Decref();
}

void _Decref()
{   // decrement use count
    if (_MT_DECR(_Uses) == 0)
    {   // destroy managed resource,
        // decrement weak reference count
        _Destroy();
        _Decwref();
    }
}

void _Decwref()
{   // decrement weak reference count
    if (_MT_DECR(_Weaks) == 0)
    {
        _Delete_this();
    }
}

As you can see, there’s a separation between destroying the object (_Destroy(), which only calls the destructor) and _Delete_this(), which only happens when the weak count reaches zero.
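The two-counter logic can be modelled with a small toy class. This is only a sketch and not the real MSVC code: ToyControlBlock and its members are invented names, the counters are plain ints rather than atomics, and the assumption that the weak counter starts at 1 (one joint "weak" reference held by all shared owners) is part of the model.

```cpp
// Toy model of a control block with two counters; NOT the real implementation.
struct ToyControlBlock {
    int uses = 1;    // number of shared owners
    int weaks = 1;   // weak owners, plus 1 held jointly by the shared owners (assumption)
    bool object_destroyed = false;  // stands in for calling the destructor
    bool block_freed = false;       // stands in for releasing the memory

    void add_weak() { ++weaks; }

    void release_shared() {
        if (--uses == 0) {
            object_destroyed = true;  // plays the role of _Destroy()
            release_weak();           // shared owners drop their joint weak reference
        }
    }

    void release_weak() {
        if (--weaks == 0) {
            block_freed = true;       // plays the role of _Delete_this()
        }
    }
};
```

With one shared and one weak owner, release_shared() sets object_destroyed but leaves block_freed false; only the later release_weak() frees the block.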

Here's the link if you want to play with the code: Coliru example.

Fear not!

The story of memory allocations and clean up is interesting… but does it affect us that much?

Possibly not much.

You shouldn’t stop using make_shared just because of that reason! :)

The thing is that it’s quite a rare situation.

Still, it’s good to know this behaviour and keep it in mind when implementing some complex systems that rely on shared and weak pointers.

For example, I am thinking about the concurrent weak dictionary data structure presented by Herb Sutter: My Favorite C++ 10-Liner | GoingNative 2013 | Channel 9.

Correct me if I’m wrong:

make_shared will allocate one block of memory for the control block and for the widget. So when all shared pointers are dead, the weak pointer will live in the cache… and that will also cause the whole memory chunk to be there as well. (Destructors are called, but memory cannot be released).

To enhance the solution, there should be some additional mechanism implemented that would clean unused weak pointers from time to time.
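Such a clean-up step could look roughly like the sketch below. It is not taken from Herb Sutter's talk: the WidgetCache alias, the Widget type and purge_expired() are hypothetical names, and the sketch ignores the locking a concurrent cache would need.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical cached type.
struct Widget {
    std::string name;
};

// Hypothetical cache: id -> weak reference to a shared widget.
using WidgetCache = std::map<int, std::weak_ptr<Widget>>;

// Erases every entry whose widget is already destroyed and returns
// how many entries were removed. Dropping the weak_ptr lets the
// control block (and a make_shared memory chunk) finally be freed.
std::size_t purge_expired(WidgetCache& cache) {
    std::size_t removed = 0;
    for (auto it = cache.begin(); it != cache.end();) {
        if (it->second.expired()) {
            it = cache.erase(it);  // erase returns the next valid iterator
            ++removed;
        } else {
            ++it;
        }
    }
    return removed;
}
```

Calling purge_expired() periodically, or on every N-th insertion, would bound how long dead entries keep their memory blocks alive.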


Remarks

After I understood the case I also realized that I’m a bit late with the explanation - others have done it in the past :) Still, I’d like to note things down.

So here are some links to resources that also described the problem:

From Effective Modern C++, page 144:

As long as std::weak_ptrs refer to a control block (i.e., the weak count is greater than zero), that control block must continue to exist. And as long as a control block exists, the memory containing it must remain allocated. The memory allocated by a std::shared_ptr make function, then, can’t be deallocated until the last std::shared_ptr and the last std::weak_ptr referring to it have been destroyed.

Summary

The whole article was a fascinating investigation to do!

Sometimes I catch myself spending too much time on things that maybe are not super crucial. Still, they are engaging. It’s great that I can share this as a blog post :)

The bottom line for the whole investigation is that the implementation of shared and weak pointers is quite complex. When the control block is allocated in the same memory chunk as the managed object, special care has to be taken when we want to release the allocated memory.

BTW: with this exercise, I needed to look at the code behind shared_ptr… it’s not super simple! Have you seen this? Or maybe you wrote a similar smart pointer?


C++ Status at the end of 2017



In Poland, it’s only a few hours until the end of the year, so it’s an excellent chance to make a summary of things that happened to C++! As you might guess the whole year was dominated by the finalization and publication of C++17. Yet, there are some other “big” things that happened. Let’s see the full report.

Previous reports: 2016, 2015, 2014, 2013, 2012.

Intro

As usual, at the end of the year, I try to gather essential events that happened in the C++ world.

Here are the main things for this year that got my attention:

  • C++17 and the stable progress of the standardization
  • Transparency of the Committee and compiler vendors
  • Community is growing!
  • More tools!

But read on to get all of the details :)

Timeline

Just to have an overview:

Date | Event
February 2nd | Intel C++ Compiler v18.0
February 27th - March 4th | Kona, HI, USA - ISO C++ Meeting
March 7th | Visual Studio 2017 (v 15.0) released
March 13th | LLVM/Clang 4.0 Released
March | C++17 finalized and sent for ISO review
April 25th - 29th | ACCU 2017
May 2nd | GCC 7.1
May 15th - 20th | C++ Now 2017
July 10th - 15th | Toronto, Canada - ISO C++ Meeting
August 14th | GCC 7.2 & Visual Studio 2017.3
September 6th | C++17 formally approved by ISO and sent for publication
September 7th | LLVM/Clang 5.0.0 Released
September 25th | Bjarne Stroustrup awarded 2017 Faraday Medal
September 23rd - October 1st | CppCon 2017
November 9th - 11th | Meeting C++ 2017
November 14th - 15th | code::dive conference in Wroclaw, PL
December 4th | C++17 has been published as ISO/IEC 14882:2017
December 7th | Visual Studio 2017 15.5 Released

C++11/14 compiler status

Before we dive into the newest stuff, let’s recall the status of C++11 and C++14 implementation.

Just for reference: Clang (since 3.4), GCC (since 5.0) and Intel (version 15.0) already have full support for C++11/14.

Visual Studio, with frequent releases of 2017 (currently versions 15.5 and 15.6), made significant progress towards implementing the missing parts: Expression SFINAE and Two-phase name lookup. It’s not fully conformant yet, but very close. Read more in the VS section below.

So, all in all, we can say that C++11/14 is supported in all major compilers! So you have no excuses not to use modern C++ :)

C++17

The new standard was the main topic for the year.

In December it was published as ISO/IEC 14882:2017 Programming languages – C++. The standard was technically completed in March so at the beginning of the year we knew its full form.

You can also download a free version of the last draft: N4659, 2017-03-21, PDF.

Plus here are my bonus PDFs:

A lot of opinions were expressed about the new standard. Some developers like it, some hoped for something more. Nevertheless, it’s done now, and all we can do is to be happy and just start coding with the new techniques and utilities :)

And, as it appears, it’s relatively easy to jump start into C++17, as most of the major compiler vendors implemented (or are very close) support for the new standard.

See below:

Compiler support for C++17

The full, up-to-date version can be found at cppreference: C++17 compiler support, so here I’ll focus on the most important parts:

The original table had confusing/wrong versions for Visual Studio; thanks to a comment from Stephan T. Lavavej I've corrected it using data from the recent VS compiler notes.

Feature | Paper | GCC | Clang | MSVC | Intel
New auto rules for direct-list-initialization | N3922 | 5.0 | 3.8 | VS 2015 | 17.0
static_assert with no message | N3928 | 6 | 2.5 | VS 2017 | 18.0
Nested namespace definition | N4230 | 6 | 3.6 | VS 2015.3 | 17.0
Allow constant evaluation for all non-type template arguments | N4268 | 6 | 3.6 | VS 2017 15.5 | XX
Fold Expressions | N4295 | 6 | 3.6 | VS 2017 15.5 | XX
Make exception specifications part of the type system | P0012R1 | 7 | 4 | VS 2017 15.5 | XX
Lambda capture of *this | P0018R3 | 7 | 3.9 | VS 2017 15.3 | XX
Dynamic memory allocation for over-aligned data | P0035R4 | 7 | 4 | VS 2017 15.5 | XX
Unary fold expressions and empty parameter packs | P0036R0 | 6 | 3.9 | VS 2017 15.5 | XX
__has_include in preprocessor conditionals | P0061R1 | 5.0 | Yes | VS 2017 15.3 | 18.0
Template argument deduction for class templates | P0091R3 | 7 | 5 | XX | XX
Non-type template parameters with auto type | P0127R2 | 7 | 4 | XX | XX
Guaranteed copy elision | P0135R1 | 7 | 4 | VS 2017 15.6 | XX
Direct-list-initialization of enumerations | P0138R2 | 7 | 3.9 | VS 2017 15.3 | 18.0
Stricter expression evaluation order | P0145R3 | 7 | 4 | XX | XX
constexpr lambda expressions | P0170R1 | 7 | 5 | VS 2017 15.3 | XX
Differing begin and end types in range-based for | P0184R0 | 6 | 3.9 | VS 2017 | 18.0
[[fallthrough]] attribute | P0188R1 | 7 | 3.9 | VS 2017 | 18.0
[[nodiscard]] attribute | P0189R1 | 7 | 3.9 | VS 2017 15.3 | 18.0
[[maybe_unused]] attribute | P0212R1 | 7 | 3.9 | VS 2017 15.3 | 18.0
Structured Bindings | P0217R3 | 7 | 4 | VS 2017 15.3 | 18.0
constexpr if statements | P0292R2 | 7 | 3.9 | VS 2017 15.3 | XX
init-statements for if and switch | P0305R1 | 7 | 3.9 | VS 2017 15.3 | 18.0
Inline variables | P0386R2 | 7 | 3.9 | VS 2017 15.5 | XX
Standardization of Parallelism TS | P0024R2 | XX | XX | VS 2017 15.5* | 18.0*
std::string_view | N3921 | 7 | 4.0 | VS 2017 | N/A
std::filesystem | P0218R1 | TS | TS | TS | N/A

As you see most of the bigger features are there!

The problematic parts: parallel STL and filesystem are close to being available.

  • Intel offers their Parallel STL implementation, see: intel/parallelstl
    and they offered it to libstdc++ - Intel offers Parallel STL implementation to GNU libstdc++ : cpp
  • TS - for filesystem means that you have to use std::experimental namespace.
  • N/A for Intel - the Intel compiler does not ship its own Standard Library implementation.
  • Visual Studio 2017 15.5 started to ship a few parallel algorithms.
  • Visual Studio Versioning (from comment by Stephan T. Lavavej): The mapping is: 2015 (and all updates) was compiler 19.0, 2017 RTM was 19.10, 2017 15.3 was 19.11, 2017 15.5 is 19.12, and 2017 15.6 will be 19.13.
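To get a feeling for some of the language features listed in the table, here is a small sketch that combines fold expressions, constexpr if, structured bindings, an init-statement in if, and [[nodiscard]]. The function names (sum, describe, lookup_or) are made up for this illustration, and the code assumes a compiler running in C++17 mode.

```cpp
#include <map>
#include <string>
#include <type_traits>

// Fold expression (N4295): sums any number of arguments.
template <typename... Ts>
constexpr auto sum(Ts... values) {
    return (values + ... + 0);
}

// constexpr if (P0292R2): the discarded branch is not instantiated.
template <typename T>
std::string describe(const T& value) {
    if constexpr (std::is_arithmetic_v<T>) {
        return "number " + std::to_string(value);
    } else {
        return "text " + std::string(value);
    }
}

// [[nodiscard]] (P0189R1), init-statement in if (P0305R1)
// and structured bindings (P0217R3).
[[nodiscard]] int lookup_or(const std::map<std::string, int>& m,
                            const std::string& key, int fallback) {
    if (auto it = m.find(key); it != m.end()) {
        const auto& [name, value] = *it;
        (void)name;  // only the mapped value is needed here
        return value;
    }
    return fallback;
}
```

For example, sum(1, 2, 3) evaluates at compile time, and describe picks a branch per type without any overloads or tag dispatching.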

C++20

Unfortunately, there won’t be C++18 (as I hoped in my April’s post :)).

However, the committee is making steady progress towards C++20. Some features have already been voted into the C++20 draft.

As revealed in the paper: Feb 2017 - P0592R0 - “To boldly suggest an overall plan for C++20”. We can expect the following major features:

  • Modules
  • Ranges
  • Concepts
  • Networking

So that’s the “master plan” and a guideline towards the new standard. Of course, it doesn’t mean other things like Coroutines (in fact Coroutines were recently published as TS), contracts or your favourite future feature won’t be approved.

The teams behind popular compilers make massive effort to stay up to date with the standard. In most of the newest versions (like GCC, Clang, VS) you can use most (or all) of C++17… but also a few C++20 features. For example, you can try concepts-lite in GCC; modules support in Clang or MSVC, or Coroutines. Not to mention Ranges.

From this point, it looks like C++20 will be a bit bigger than C++17. Still, it’s important to remember that the Committee prepares a new standard every three years. So don’t expect that they’ll delay the publication until all features are done (like we needed to wait 10+ years for C++11).

ISO C++ meetings

There were three committee meetings this year - in Kona, Toronto and Albuquerque.
Roughly at the beginning of the year, the committee closed work for C++17 and at the second and the third meeting they started voting features for C++20.

2017-02-27 to 03-04: Kona, HI, USA

During the meeting, C++17 was finalized and sent for the final ISO review.

The committee now shifts to prepare C++20; you can even read some plans here Feb 2017 - P0592R0 - “To boldly suggest an overall plan for C++20”.

The plan is to have at least: modules, ranges, concepts and networking

Here are the trip reports:

2017-07-10 to 15: Toronto, Canada

The first meeting where Committee experts could vote changes into Draft C++20!

For example:

  • Concepts TS was merged into draft C++20
  • Add designated initializers. Draft C++20 now allows code like:
struct A { int x; int y; int z; };
A b{.x = 1, .z = 2};

Trip reports:

2017-11-06 to 11: Albuquerque, New Mexico, USA

A primary goal of the meeting was to address the comments received from national bodies in the Modules TS’s comment ballot that ran this summer

Some new features added to C++20:

Trip reports

Compiler Notes

Current versions and most notable updates.

Visual Studio

Current version VS 2017 update 5 - 15.5.2 - Release notes - December 2017.

Microsoft team made 5 releases of VS 2017! (or 6 if we count RTM Release :))

Update: from Stephan T. Lavavej:
Only 3 were significant toolset updates, though (15.0, 15.3, 15.5; the other releases contained IDE updates and the occasional compiler bugfix).

With the recent version, you can even use some of the parallel algorithms. I did a quick experiment, and it seemed to work:

C++ Parallel STL Algorithms in VS 2017

As you can see in the image above MSVC created a pool of threads, and each thread invoked my lambda. In V15.5 the following algorithms can be invoked in parallel: all_of, any_of, for_each, for_each_n, none_of, reduce, replace, replace_if, sort.

Here are some links to relevant blog posts from VC team. I like the transparency and that they share so much information with us.

GCC

August 14 - GCC 7.2, GCC 7 Release Series Changes

Clang

Current version: 5.0.1 - 21 Dec 2017, Release Notes

If you wonder why LLVM moved “slowly” with versions like 3.3, 3.4, 3.5… and now rapidly went from 4.0 to 5.0 here’s the reason: the versioning scheme changed. Previously major version increased by adding “0.1”, now it’s done by adding “1.0”.

Intel compiler

Tools

This is a brand new section in the summary.

While compilers do the primary job with our C++ code, we cannot forget about the importance of other tools.

Bear in mind that parsing C++ code is a tough task. Thanks to Clang, developing tools is now significantly easier and more streamlined.

Here are some notable tools that are worth knowing:

Clang Tools

Clang Power Tools for Visual Studio

IDE and Productivity

Code Analyzers

Package managers

We probably won’t see a standard package manager for C++ (as other languages sometimes provide), but there’s good progress with such tools. Read this article/discussion for more information: Does C++ need a universal package manager? - by (Paul Fultz II).

Anyway, I recently started using Conan, and it really works. Previously I had to download the components, install them, and set proper links and paths in the Visual Studio project. Now all I have to do is to provide an appropriate name of the library in conanfile.txt and run Conan to do all the work. The missing part: the list of available packages isn’t huge… but it’s improving.

Visualizers

Code Visualization in Sourcetrail

Sourcetrail (image above). Its objective is to assist with code exploration by creating dynamic graphs that show your project from a different perspective. See my review in this post - Better code understanding with Sourcetrail.

  • The tool is free for non-commercial use!

Also, Cpp Depend offers visualization options for projects: A picture is worth a thousand words: Visualize your C/C++ Projects – CppDepend Blog

Conferences

I am pleased to see that more and more C++ conferences are appearing. We have at least four strong points

  • CppCon
  • C++Now
  • Meeting C++
  • ACCU

But there are more: like Code::Dive, Italian CppCon or Pacific C++ - held for the first time this year!

Just in case here’s the link to ISO C++ page with all registered conferences around the world: Conferences Worldwide, C++ FAQ.

CppCon 2017

Approaching 1200 attendees and 7 tracks

The opening keynote from Bjarne Stroustrup Learning and Teaching Modern C++

Near the same time Bjarne Stroustrup was awarded 2017 Faraday Medal. Congratulations!

Some of the trip reports (more on my github repo)

And one of ITHare reports (more on his blog)

Meeting C++

Schedule.

This year Bjarne Stroustrup gave the opening keynote (“What C++ is and what it will become”). The closing keynote was presented by Louis Dionne (“C++ metaprogramming: evolution and future directions”).

Meeting C++ 2016 Playlist

Code::Dive in Wroclaw, PL

November 14th & 15th, Code::Dive

code::dive is non-profit, annual conference for C++ enthusiasts
sponsored by NOKIA. The main idea behind the conference is to share the
knowledge beyond cutting edge technologies and build networking
between the people.

Mostly about C++ plus other languages like Rust, Go, Python.

This year I attended the conference and here’s my trip report: code::dive 2017 conference report, and also Adi Shavit’s code::dive Trip Report.

Community

Another strong point of the year: the community is growing! There are so many local C++ groups, slack channel, conferences, blogs, youtube channels… and we even have a podcast: CppCast.

Maybe it’s my personal feeling - I usually track the changes and try to be active in the community - so I feel that growth and vibrancy. Still, I hope other developers can share the same opinion.

I am delighted that my city - Cracow - has now its C++ group! C++ User Group Krakow - join if you’re near the city! :)

Thanks to Jens Weller for giving advice on how to start such communities, for the motivation to run them, and for hosting user group news on the Meeting C++ site. See User Groups Worldwide or news like C++ User Group Meetings in November 2017.

And congratulations for his 5th year of Meeting C++!

Jens also organized r/cpp_review - C++ Library Reviews:

My (Jens) motivation to start this is that with these reviews a community focused on quality in modern C++ could grow, where people are able to learn by example on various libraries. So while more experienced C++ users might be able to give better feedback on the overall design of a library

Please join the C++ Slack channel: https://cpplang.now.sh/

In terms of blogs I highly recommend the following:

And of course set isocpp.org - Standard C++ as your main homepage :)

You can also have a look at this r/cpp thread - Happy New Year r/cpp! and share your thoughts :)

Books

Some of the books released this year worth seeing:

Publication date | Title and authors
May 15 | Modern C++ Programming Cookbook by Marius Bancila. Read my review here.
June 28 | C++17 STL Cookbook by Jacek Galowicz. Read my review here.
August 31 | Concurrency with Modern C++ by Rainer Grimm
September 18 | C++ Templates: The Complete Guide (2nd Edition) by David Vandevoorde, Nicolai M. Josuttis, and Douglas Gregor. Book's page: tmplbook.com
September 29 | Clean C++: Sustainable Software Development Patterns and Best Practices with C++ 17 by Stephan Roth
November 3 | The C++ Standard Library by Rainer Grimm

I am still waiting for Large-Scale C++ Volume I, John Lakos, it finally should be ready in April 2018! At code::dive John Lakos mentioned that the draft is completed. So hopefully this date won’t be shifted.

Summary

Wow, so many things happened!

Four things that I’d like to emphasize for the year:

  • C++17 and the stable progress of the standardization
  • Transparency of the Committee and compiler vendors
  • Community is growing!
  • More tools!

As I mentioned at the beginning, the finalization of C++17 set the “theme” for the whole year. I like that the 3-year standardization process works and we can expect C++20 without delays. What’s more, the compiler vendors have already implemented most of the C++17 features, so it’s easy to apply new techniques to your projects. I also feel that “we’re all” creating the new language, not just “they”. There are many groups or even r/cpp discussions where you can express your thoughts about the new things in the standard. I like such transparency.

There are, of course, downsides to frequent releases. A lot of C++ code hasn’t even reached C++11 yet. Many of us struggle with the maintenance of legacy code, and learning the modern standard is not an easy task. During the year I’ve heard an opinion that “real C++” (that we use in most of our projects) is so different from the C++ presented in the newest standard. The gap is getting bigger and bigger, and developers might be frustrated (I expressed more thoughts on that topic in my post: How To Stay Sane with Modern C++).

But C++17 was only part of what happened in 2017. The community is growing, and so are the list of conferences and the number of active blogs (with valuable content)… and finally tools are working :) (and they become a crucial part of any development environment). It looks like being a C++ developer is getting more comfortable… a bit :)

So, all in all…. it was not a bad year… right? :)

Your turn

  • What do you think about C++ in 2017?
  • What was the most important event/news for you?
  • Did I miss something? Let me know in comments!

The Pimpl Pattern - what you should know


PIMPL pattern

Have you ever used the pimpl idiom in your code? No matter what’s your answer read on :)

In this article I’d like to gather all the essential information regarding this dependency breaking technique. We’ll discuss the implementation (const issue, back pointer, fast impl), pros and cons, alternatives and also show examples where it is used. You’ll also see how modern C++ can change this pattern. Moreover, I hope you’ll help me and provide your examples.

Intro

A lot has been written about the pimpl pattern. Starting from some old posts by Herb Sutter: GotW #24: Compilation Firewalls and GotW #7b Solution: Minimizing Compile-Time Dependencies.
To some recent ones: GotW #100: Compilation Firewalls and GotW #101: Compilation Firewalls, Part 2 and even a few months ago from Fluent C++ How to implement the pimpl idiom by using unique_ptr. Plus of course tons of other great articles…

So why would I like to write again about pimpl?

First of all, I’d like to make a summary of the essential facts. The pattern is used to break dependencies - both physical and logical of the code.
The basics sound simple, but as usual, there’s more to the story.

There’s also an important question: should we all use pimpl today? Maybe there are better alternatives?

Let’s start with a simple example to set the background:

The basics

Pimpl might appear with different names: d-pointer, compiler firewall or even Cheshire Cat pattern or Opaque pointer.

In its basic form the pattern looks as follows:

  • In a class we move all private members to a newly declared type - like PrivateImpl class
  • it’s only forward declared in the header file of the main class
  • in the corresponding cpp file we declare the PrivateImpl class and define it.
  • now, if you change the private implementation, the client code won't have to be recompiled (as the interface hasn't changed).

So it might look like that (crude, old style code!):

// class.h
class MyClassImpl;
class MyClass {
public:
    MyClass();
    ~MyClass();
    // ...
    void DoSth();
private:
    MyClassImpl* m_pImpl; // warning!!!
                          // a raw pointer! :)
};

// class.cpp
class MyClassImpl
{
public:
    void DoSth() { /*...*/ }
};

MyClass::MyClass()
    : m_pImpl(new MyClassImpl())
{}

MyClass::~MyClass() { delete m_pImpl; }

// the main class only forwards the call
void MyClass::DoSth() {
    m_pImpl->DoSth();
}

Ech… ugly raw pointers!

So briefly: we pack everything that is private into that forward declared class. We use just one member of our main class - the compiler can work with only the pointer without having full type declaration - as only size of the pointer is needed. Then the whole private declaration & implementation happens in the .cpp file.

Of course in modern C++ it’s also advised to use unique_ptr rather than raw pointers.

The two obvious downsides of this approach: we need a separate memory allocation to store the private section. And also the main class just forwards the method calls to the private implementation.

Ok… but it’s all… right? Not so easy!

The above code might work, but we have to add a few bits to make it work in real life.

More code

We have to ask a few questions before we can write the full code:

  • is your class copyable or only movable?
  • how to enforce const for methods in that private implementation?
  • do you need a “backward” pointer - so that the impl class can call/reference members of the main class?
  • what should be put in that private implementation? everything that’s private?

The first part - copyable/movable relates to the fact that with the simple - raw - pointer we can only shallow copy an object. Of course, this happens in every case you have a pointer in your class.

So, for sure we have to implement copy constructor (or delete it if we want only movable type).

What about that const problem? Can you catch it in the basic example?

If you declare a method const then you cannot change members of the object. In other words, they become const. But it’s a problem for our m_pImpl which is a pointer. In a const method this pointer will also become const, which means we cannot assign a different value to it… but… we can happily call all methods of the underlying private class (not only the const ones)!
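Here is a minimal sketch of that pitfall. Holder, Impl, LooksConst() and the other names are hypothetical and exist only to show that constness does not propagate through a raw pointer.

```cpp
// Hypothetical private implementation.
struct Impl {
    int counter = 0;
    void Mutate() { ++counter; }          // non-const
    int Read() const { return counter; }  // const
};

class Holder {
public:
    Holder() : m_pImpl(new Impl()) {}
    ~Holder() { delete m_pImpl; }
    Holder(const Holder&) = delete;
    Holder& operator=(const Holder&) = delete;

    // Declared const, yet it changes observable state: inside a const method
    // m_pImpl has the type `Impl* const`, so the pointee is still mutable.
    void LooksConst() const { m_pImpl->Mutate(); }

    int Value() const { return m_pImpl->Read(); }

private:
    Impl* m_pImpl;
};
```

Even on a `const Holder`, calling LooksConst() compiles fine and bumps the counter, so the compiler gives no help in keeping const methods honest.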

So what we need is a conversion/wrapper mechanism. Something like this:

const MyClassImpl* Pimpl() const { return m_pImpl; }
MyClassImpl* Pimpl() { return m_pImpl; }

And now, in all of our methods of the main class, we should be using that function wrapper, not the pointer itself.

So far, I didn’t mention that “backward” pointer (“q-pointer” in Qt terminology). The answer is connected to the last point - what should we put in the private implementation - only private fields? Or maybe even private functions?

The basic code won’t show those practical problems. But in a real application, a class might contain a lot of methods and fields. I’ve seen examples where the whole private section (with methods) goes to the pimpl class. Still, sometimes the pimpl class needs to call a ‘real’ method of the main class, so we need to provide that “back” pointer. This can be done at construction: just pass the this pointer.
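A back pointer could be wired up as in the sketch below. Owner, OwnerImpl, Name() and Describe() are hypothetical names, and everything sits in one file only to keep the example short; in a real project OwnerImpl would live in the .cpp file.

```cpp
#include <memory>
#include <string>

class Owner;

// Would normally be hidden in the .cpp file.
class OwnerImpl {
public:
    explicit OwnerImpl(Owner* owner) : m_owner(owner) {}
    std::string Describe() const;  // defined below, once Owner is complete
private:
    Owner* m_owner;  // the "back" pointer (q-pointer in Qt terms)
};

class Owner {
public:
    Owner() : m_pImpl(new OwnerImpl(this)) {}  // pass `this` at construction
    // Copy and move are disabled in this sketch: a copied or moved impl
    // would still point back at the old Owner.
    Owner(const Owner&) = delete;
    Owner& operator=(const Owner&) = delete;

    std::string Name() const { return "Owner"; }  // a 'real' method of the main class
    std::string Describe() const { return m_pImpl->Describe(); }
private:
    std::unique_ptr<OwnerImpl> m_pImpl;
};

std::string OwnerImpl::Describe() const {
    // the implementation calls back into the main class
    return "impl of " + m_owner->Name();
}
```

Note the extra care needed for copy and move: if the main class supports them, the back pointer has to be re-targeted in every such operation.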

The improved version

So here’s an improved version of our example code:

// class.h
#include <memory>

class MyClassImpl;
class MyClass
{
public:
    explicit MyClass();
    ~MyClass();

    // movable:
    MyClass(MyClass&& rhs) noexcept;
    MyClass& operator=(MyClass&& rhs) noexcept;

    // and copyable
    MyClass(const MyClass& rhs);
    MyClass& operator=(const MyClass& rhs);

    void DoSth();
    void DoConst() const;

private:
    // const-correct access to the implementation
    const MyClassImpl* Pimpl() const { return m_pImpl.get(); }
    MyClassImpl* Pimpl() { return m_pImpl.get(); }

    std::unique_ptr<MyClassImpl> m_pImpl;
};

// class.cpp
class MyClassImpl
{
public:
    ~MyClassImpl() = default;

    void DoSth() {}
    void DoConst() const {}
};

MyClass::MyClass() : m_pImpl(new MyClassImpl())
{
}

MyClass::~MyClass() = default;
MyClass::MyClass(MyClass&&) noexcept = default;
MyClass& MyClass::operator=(MyClass&&) noexcept = default;

// deep copy of the private implementation
MyClass::MyClass(const MyClass& rhs)
    : m_pImpl(new MyClassImpl(*rhs.m_pImpl))
{}

MyClass& MyClass::operator=(const MyClass& rhs) {
    if (this != &rhs)
        m_pImpl.reset(new MyClassImpl(*rhs.m_pImpl));

    return *this;
}

void MyClass::DoSth()
{
    Pimpl()->DoSth();
}

void MyClass::DoConst() const
{
    Pimpl()->DoConst();
}

A bit better now.

The above code uses

  • unique_ptr - but see that the destructor for the main class must be defined in the cpp file. Otherwise, the compiler will complain that the default deleter is applied to an incomplete type…
  • The class is movable and copyable, so four methods were defined
  • To be safe with const methods, all of the proxy methods of the main class use the Pimpl() method to fetch the proper type of the pointer.

Have a look at this blog Pimp My Pimpl — Reloaded by Marc Mutz for a lot of information about pimpl.

You can play with the full example, live, here (it also contains some more nice stuff to explore)

As you can see, there’s a bit of code that’s boilerplate. That’s why there are several approaches how to wrap that idiom into a separate utility class. Let’s have a look below.

As a separate class

For example Herb Sutter in GotW #101: Compilation Firewalls, Part 2 suggests the following wrapper:

// taken from Herb Sutter
template<typename T>
class pimpl {
private:
    std::unique_ptr<T> m;
public:
    pimpl();
    template<typename ...Args> pimpl(Args&& ...);
    ~pimpl();
    T* operator->();
    T& operator*();
};

Still, you’re left with the implementation of copy construction if required.

If you want a full blown wrapper take a look at this post PIMPL, Rule of Zero and Scott Meyers by Andrey Upadyshev.

In that article you can see a very advanced implementation of such helper type:

SPIMPL (Smart Pointer to IMPLementation) - a small header-only C++11 library with aim to simplify the implementation of PIMPL idiom
https://github.com/oliora/samples/blob/master/spimpl.h

Inside the library you can find two types: spimpl::unique_impl_ptr - for movable only pimpl, and spimpl::impl_ptr for movable and copyable pimpl wrapper.

Fast pimpl

One obvious point about pimpl is that a memory allocation is needed to store the private parts of the class. If you’d like to avoid it… and you really care about that memory allocation… you can try:

  • provide a custom allocator and use some fixed memory chunk for the private implementation
  • or reserve a large block of memory in the main class and use placement new to allocate the space for pimpl.
    • Note that reserving space upfront is flaky - what if the size changes? and what’s more important - do you have a proper alignment for the type?
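The second option can be sketched roughly as below. This is only an illustration under stated assumptions: FastWidget, its 64-byte budget and the Name() method are hypothetical names invented for the example, and it needs C++17 (for std::launder). The two static_assert lines are the guard against the "what if the size changes?" and alignment concerns: the build fails as soon as the private part no longer fits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <string>
#include <utility>

// What would live in the header: only a size/alignment budget leaks out.
class FastWidget {
public:
    explicit FastWidget(std::string name);
    ~FastWidget();

    // Copy and move are left out to keep the sketch short.
    FastWidget(const FastWidget&) = delete;
    FastWidget& operator=(const FastWidget&) = delete;

    const std::string& Name() const;

private:
    struct Impl;  // defined only in the "cpp" part below

    // Hypothetical budget; it has to cover Impl on every target platform.
    static constexpr std::size_t kImplSize = 64;
    static constexpr std::size_t kImplAlign = alignof(std::max_align_t);

    Impl* Pimpl();
    const Impl* Pimpl() const;

    alignas(kImplAlign) unsigned char m_storage[kImplSize];
};

// What would live in the cpp file.
struct FastWidget::Impl {
    std::string name;
};

auto FastWidget::Pimpl() -> Impl* {
    return std::launder(reinterpret_cast<Impl*>(m_storage));
}

auto FastWidget::Pimpl() const -> const Impl* {
    return std::launder(reinterpret_cast<const Impl*>(m_storage));
}

FastWidget::FastWidget(std::string name) {
    // The build breaks here as soon as Impl outgrows the reserved buffer.
    static_assert(sizeof(Impl) <= kImplSize, "buffer too small for Impl");
    static_assert(alignof(Impl) <= kImplAlign, "buffer under-aligned for Impl");
    assert(reinterpret_cast<std::uintptr_t>(m_storage) % alignof(Impl) == 0);
    ::new (static_cast<void*>(m_storage)) Impl{std::move(name)};
}

FastWidget::~FastWidget() {
    Pimpl()->~Impl();  // placement new requires a manual destructor call
}

const std::string& FastWidget::Name() const {
    return Pimpl()->name;
}
```

The price is visible right away: the buffer size becomes part of the header, so growing Impl beyond the budget forces a change that clients do see.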

Herb Sutter wrote about this idea here GotW #28: The Fast Pimpl Idiom.

A modern version that uses a C++11 feature, aligned_storage, is described here:
My Favourite C++ Idiom: Static PIMPL / Fast PIMPL by Kai Dietrich or Type-safe Pimpl implementation without overhead | Probably Dance blog.

But be aware that it’s only a trick and might not work. Or it might work on one platform/compiler, but not on another configuration.

In my personal opinion, I don’t see this approach as a good one. Pimpl is usually used for larger classes (maybe managers, types in the interfaces of a module), so that additional cost won’t matter much.

We’ve seen a few core parts of the pimpl pattern, so we can now discuss its strengths and weaknesses.

Pros and Cons

Pros

  • Provides Compilation Firewall: if the private implementation changes, the client code doesn’t have to be recompiled.
    • Headers can become smaller, as types mentioned only in a class implementation need no longer be defined for client code.
    • So all in all, it might lead to better compilation times
  • Provides Binary Compatibility: very important for library developers. As long as the binary interface stays the same, you can link your app with a different version of a library.
    • To simplify, if you add a new virtual method then the ABI changes, but adding non-virtual methods (of course without removing existing ones) doesn’t change ABI.
    • See Fragile Binary Interface Problem.
  • Possible advantage: No v-table (if the main class contains only non-virtual methods).
  • Small point: Can be used as an object on stack

Cons

  • Performance - one level of indirection is added.
  • A memory chunk has to be allocated (or preallocated) for the private implementation.
    • Possible memory fragmentation
  • Complex code and it requires some discipline to maintain such classes.
  • Debugging - you don’t see the details immediately, class is split

Other issues

  • Testability - there’s an opinion that testing such a pimpl class might cause problems. But since you usually test only the public interface, it shouldn’t matter.
  • Not for every class. This pattern is often best for large classes at the “interface level”. I don't think vector3d with that pattern would be a good idea…

Alternatives

  • Redesign the code
  • To improve build times:
    • Use precompiled headers
    • Use build caches
    • Use incremental build mode
  • Abstract interfaces
  • COM
    • also based on abstract interfaces, but with some more underlying machinery.

How about modern C++

As of C++17, we don’t have any new features that target pimpl. With C++11 we got smart pointers, so try to implement pimpl with them - not with raw pointers. Plus of course, we get a whole lot of template metaprogramming stuff that helps when declaring wrapper types for the pimpl pattern.

But in the future, we might want to consider two options: Modules and operator dot.

Modules will play an important part in reducing the compilation times. I haven’t played with modules a lot, but as I see it, using pimpl just for the compilation speed might become less and less critical. Of course, keeping dependencies low is always essential.

Another feature that might come in handy is operator dot - designed by Bjarne Stroustrup and Gabriel Dos Reis. PDF - N4477 - didn’t make it into C++17, but maybe we’ll see it in C++20?

Basically, it allows you to overload the dot operator and provides much nicer code for all of the proxy types.

Who is using

I’ve gathered the following examples:

  • QT:
    • This is probably the most prominent example (that you can find publicly) where private implementation is heavily used.
    • There’s even a nice intro article discussing d-pointers (as they call pimpl): D-Pointer - Qt Wiki
    • QT also shows how to use pimpl with inheritance. In theory, you need a separate pimpl for every derived class, but QT uses just one pointer.
  • OpenSceneGraph
  • Assimp library
    • Exporter
    • Have a look at this comment from assimp.hpp :)
// Holy stuff, only for members of the high council of the Jedi.
class ImporterPimpl;

// ...

// Just because we don't want you to know how we're hacking around.
ImporterPimpl* pimpl;
  • Open Office
  • PhysX from Nvidia

It looks like the pattern is used somewhere :)

Let me know if you have other examples.

If you want more examples follow those two questions at stack overflow:

Summary

Pimpl looks simple… but as usual in C++ things are not simple in practice :)

The main points:

  • Pimpl provides ABI compatibility and reduced compilation dependencies.
  • Starting from C++11 you should use unique_ptr (or even shared_ptr) to implement the pattern.
  • To make it work, decide if your main class has to be copyable, or just movable.
  • Take care of the const methods so that the private implementation honours them.
  • If the private implementation needs to access main class members, then a “back pointer” is needed.
  • Some optimizations are possible (to avoid separate memory allocation), but might be tricky.
  • There are many uses of the pattern in open source projects, QT uses it heavily (with inheritance and back pointer)

Next week I’ll show you a practical example - a utility app - where I use pimpl to break compilation dependencies between classes. Later, the project will also serve as a test project for playing with ABI compatibility. I’ll also use Conan - package manager - to streamline my work when third-party libraries are required.

pimpl vs Abstract Interface - a practical tutorial

Pimpl vs abstract interface

Let’s see pimpl and its alternatives in a real application! I’ve implemented a small utility app - for file compression - where we can experiment with various designs.

Is it better to use pimpl or maybe abstract interfaces? Read on to discover.

Intro

In my previous post I covered the pimpl pattern. I discussed the basic structure, extensions, pros and cons and alternatives. Still, the post might sound a bit “theoretical”. Today I’d like to describe a practical usage of the pattern. Rather than inventing artificial names like MyClass and MyClassImpl you’ll see something more realistic: like FileCompressor or ICompressionMethod.

Moreover, this will be my first time when I’ve used Conan to streamline the work with third-party libraries (as we need a few of them).

Ok, so what’s the example?

The app - command line file compressor

As an example, I’ve chosen a utility app that helps with packing files.

Basic use case:
Users run this utility app in a console environment. A list of files (or directories) can be passed, along with the name of the output file. The output file name also specifies the compression method: .zip for zip, .bz2 for BZ compression, etc. Users can also run the app in help mode, which lists some basic options and the available compression methods. When the compression is finished, a simple summary is shown: bytes processed and the final size of the output file.

Run app file compression

Requirements:

  • a console application
  • command line with a few options
    • output file - also specifies the compression method
    • list of files (also with directory support)
  • basic summary at the end of the compression process

The same can be achieved with the command line mode of your favourite archive manager (like 7z). Still, I wanted to see how hard it is to compress a file from C++.

The full source code can be found at my GitHub page: GitHub/fenbf/CompressFileUtil.

Simple implementation

Let’s start simple.

When I was learning how to use Conan - through their tutorial - I met a helpful library called Poco:

Modern, powerful open source C++ class libraries for building network- and internet-based applications that run on desktop, server, mobile and embedded systems.

One thing I’ve noticed was that it supports Zip compression. So all I have to do for the application is to use the library, and the compression is done.

I came up with the following solution:

Starting from main() and going into details of the implementation:

int main(int argc, char* argv[])
{
    auto inputParams = ParseCommandLine(argc, argv);

    if (inputParams.has_value())
    {
        auto params = inputParams.value();

        RunCompressor(params);
    }
    else
        ShowHelp();
}

I won’t discuss the underlying implementation of parsing the command line, let’s skip to RunCompressor() instead:

void RunCompressor(const InputParams& params) noexcept
{
    try
    {
        FileCompressor compressor;
        compressor.Compress(params.m_files, params.m_output);
    }
    catch (const std::exception& ex)
    {
        std::cerr << "Error: " << ex.what() << '\n';
    }
    catch (...)
    {
        std::cerr << "Unexpected error\n";
    }
}

Ok, so what’s the deal with pimpl or abstract interfaces?

The first iteration has none of them :)

FileCompressor is declared in FileCompressor.h and is directly included by the file with main() (CompressFileUtil.cpp):

#include <Poco/Zip/Compress.h>

class FileCompressor
{
public:
    void Compress(const StringVector& vecFileNames,
                  const string& outputFileName);

private:
    void CompressZip(const StringVector& vecFileNames,
                     const string& outputFileName);
    void CompressOneElement(Poco::Zip::Compress& compressor,
                            const string& fileName);
};

The class is straightforward: just one method, Compress, where you pass a vector of strings (filenames) and the file name of the output archive to create. It will check the output file extension and forward the work to CompressZip (only zip for now):

void FileCompressor::CompressZip(const StringVector& vecFileNames,
                                 const string& outputFileName)
{
    std::ofstream out(outputFileName, std::ios::binary);
    Poco::Zip::Compress compressor(out, /*seekable output*/ true);

    for (const auto& fileName : vecFileNames)
        CompressOneElement(compressor, fileName);

    compressor.close();
}

CompressOneElement() uses Poco’s compressor to do all the magic:

Poco::File f(fileName);
if (f.exists())
{
    Poco::Path p(f.path());
    if (f.isDirectory())
    {
        compressor.addRecursive(p, Poco::Zip::ZipCommon::CL_MAXIMUM,
                                /*excludeRoot*/ true, p.getFileName());
    }
    else if (f.isFile())
    {
        compressor.addFile(p, p.getFileName(),
                           Poco::Zip::ZipCommon::CM_DEFLATE,
                           Poco::Zip::ZipCommon::CL_MAXIMUM);
    }
}

Please notice two things:

  • Firstly: all of the private implementation is shown here (no fields, but private methods).
  • Secondly: types from a third party library are included (might be avoided by using forward declaration).

In other words: every time you decide to change the private implementation (add a method or field) every compilation unit that includes the file will have to be recompiled.

Now we’ve reached the main point of this article:

We aim for pimpl or an abstract interface to limit compilation dependencies.

Of course, the public interface might also change, but it’s probably less often than changing the internals.

In theory, we could avoid Poco types in the header - we could limit the number of private methods, maybe implement static free functions in FileCompressor.cpp. Still, sooner or later we’ll end up having private implementation revealed in the class declaration in one way or another.

I’ve shown the basic code structure and classes. But let’s now have a look at the project structure and how those third-party libraries will be plugged in.

Using Conan to streamline the work

The first iteration only implements the part of requirements, but at least the project setup is scalable and a solid background for later steps.

As I mentioned before, with this project I’ve used Conan (Conan 1.0 was released on 10th January, so only a few days ago!) for the first time (apart from some little tutorials). Firstly, I needed to understand where I could plug it in and how it could help.

In short: in the case of our application, Conan does all the work to provide other libraries for the project. We are using some third party libraries, but a Conan package can be much more (and you can create your custom ones).

To fetch a package you have to specify its name in a special file: conanfile.txt (that is placed in your project directory).

It might look as follows:

[requires]
Poco/1.8.0.1@pocoproject/stable

[generators]
visual_studio

Full reference here docs: conanfile.txt

Conan has several generators that do all the work for you. They collect information from dependencies, like include paths, library paths, library names or compile definitions, and they translate/generate a file that the respective build system can understand. I was happy to see “Visual Studio Generator” as one of them (your favourite build tool is probably also on the list of Conan’s generators).

With this little setup the magic can start:

Now, all you have to do is to run (in that folder) the Conan tool and install the packages.

conan install . -s build_type=Debug -if build_debug -s arch=x86

This command will fetch the required packages (or use cache), also get package’s dependencies, install them in a directory (in the system), build the binaries (if needed) and finally generate correct build options (include/lib directories) for your compiler.

Conan with Poco Libraries

In the case of Visual Studio, I’ll get conanbuildinfo.props with all the settings in my project folder\build_debug. So I have to include that property file in my project and build it… and it should work :)

Conan and Visual Studio props

Conan and Visual Studio setting

But why does Conan help here?

Imagine what you would have to do to add another library. Each step:

  • download a proper version of the library,
  • download dependencies,
  • build all,
  • install,
  • set up Visual Studio (or another system) and provide the correct paths…

I hate doing such work. But with Conan, replacing libs and playing with various alternatives is very easy.

Moreover, Conan managed to install OpenSSL library - a dependency for Poco - and on Windows building OpenSSL is a pain as far as I know.

Ok… but where can you find all of the libraries?

Have a look here:

Let’s go back to the project implementation.

Improvements, more libs:

The first version of the application uses only Poco to handle zip files, but we need at least two more:

  • Boost program options - to provide an easy way to parse the command line arguments.
  • BZ compression library - I’ve searched for various libs that would be easy to plug into the project, and BZ seems to be the easiest one.

In order to use the libraries, I have to add the proper links/names to conanfile.txt.

[requires]
Poco/1.8.0.1@pocoproject/stable
Boost.Program_Options/1.65.1@bincrafters/stable
bzip2/1.0.6@conan/stable

Thanks to Bincrafters boost libraries are now divided into separate packages!

Still, boost in general has a dense dependency graph (between the libraries), so the program options library that I needed brought in a lot of other boost libs. Nevertheless, it works nicely in the project.

We have all the libraries, so we move forward with the project. Let’s prepare some background work for the support of more compression methods.

Compression methods

Since we want to have two methods (and maybe more in the future), it’s better to separate the classes. That will work better when we’d like to add another implementation.

The interface:

class ICompressionMethod
{
public:
    ICompressionMethod() = default;
    virtual ~ICompressionMethod() = default;

    virtual DataStats Compress(const StringVector& vecFileNames,
                               const string& outputFileName) = 0;
};

Then we have two derived classes:

  • ZipCompression - converted from the first implementation.
  • BZCompression - BZ2 compression doesn’t provide an archiving option, so we can store just one file using that method. Still, it’s common to pack the files first (like using TAR) and then compress that single file. In this implementation, for simplicity, I’ve used Zip (fastest mode) as the first step, and then BZ compresses the final package.

There’s also a factory class that simplifies the process of creating the required classes… but I’ll skip the details for now.

Compression methods class structure

We have all the required code, so let’s try the pimpl approach:

pimpl version

The basic idea of the pimpl pattern is to have another class “inside” a class we want to divide. That ‘hidden’ class handles all of the private section.

In our case, we need CompressorImpl that implements the private details of FileCompressor.

The main class looks like this now:

class FileCompressor
{
public:
    explicit FileCompressor();
    ~FileCompressor();

    // movable:
    FileCompressor(FileCompressor&& fc) noexcept;
    FileCompressor& operator=(FileCompressor&& fc) noexcept;

    // and copyable
    FileCompressor(const FileCompressor& fc);
    FileCompressor& operator=(const FileCompressor& fc);

    void Compress(const StringVector& vecFileNames,
                  const string& outputFileName);

private:
    class CompressorImpl;

    const CompressorImpl* Pimpl() const { return m_pImpl.get(); }
    CompressorImpl* Pimpl() { return m_pImpl.get(); }

    std::unique_ptr<CompressorImpl> m_pImpl;
};

The code is longer than in the first approach. This is because we have to write all the preparation code:

  • in the constructor we’ll create and allocate the private pointer.
  • we’re using unique_ptr, so the destructor must be defined in the cpp file in order to avoid compilation problems (missing deleter type).
  • the class is movable and copyable, so additional move and copy constructors have to be implemented.
  • CompressorImpl is forward declared in the private section.
  • Pimpl accessors are required to implement const methods properly. See why it’s essential in my previous post.

And the CompressorImpl class:

class FileCompressor::CompressorImpl
{
public:
    CompressorImpl() { }

    void Compress(const StringVector& vecFileNames,
                  const string& outputFileName);
};

Unique pointer for pimpl is created in the constructor of FileCompressor and optionally copied in the copy constructor.
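To make that concrete, here is a single-file sketch of how the special members could be defined. It is a simplified stand-in, not the code from the repository: the CompressorImpl body only counts file names, and FilesSeen() is a hypothetical method added so the copy behaviour can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

using StringVector = std::vector<std::string>;

// What would live in FileCompressor.h.
class FileCompressor {
public:
    FileCompressor();
    ~FileCompressor();
    FileCompressor(FileCompressor&&) noexcept;
    FileCompressor& operator=(FileCompressor&&) noexcept;
    FileCompressor(const FileCompressor&);
    FileCompressor& operator=(const FileCompressor&);

    void Compress(const StringVector& vecFileNames,
                  const std::string& outputFileName);
    std::size_t FilesSeen() const;  // hypothetical, for the demo only

private:
    class CompressorImpl;
    const CompressorImpl* Pimpl() const { return m_pImpl.get(); }
    CompressorImpl* Pimpl() { return m_pImpl.get(); }

    std::unique_ptr<CompressorImpl> m_pImpl;
};

// What would live in FileCompressor.cpp.
class FileCompressor::CompressorImpl {
public:
    void Compress(const StringVector& vecFileNames, const std::string&) {
        m_filesSeen += vecFileNames.size();  // stand-in for real compression
    }
    std::size_t FilesSeen() const { return m_filesSeen; }

private:
    std::size_t m_filesSeen = 0;
};

FileCompressor::FileCompressor() : m_pImpl(std::make_unique<CompressorImpl>()) {}

// Defaulted here, where CompressorImpl is a complete type.
FileCompressor::~FileCompressor() = default;
FileCompressor::FileCompressor(FileCompressor&&) noexcept = default;
FileCompressor& FileCompressor::operator=(FileCompressor&&) noexcept = default;

// Copying means cloning the private part, not sharing it.
FileCompressor::FileCompressor(const FileCompressor& rhs) {
    assert(rhs.m_pImpl && "copying from a moved-from object");
    m_pImpl = std::make_unique<CompressorImpl>(*rhs.m_pImpl);
}

FileCompressor& FileCompressor::operator=(const FileCompressor& rhs) {
    if (this != &rhs) {
        assert(rhs.m_pImpl && "copying from a moved-from object");
        m_pImpl = std::make_unique<CompressorImpl>(*rhs.m_pImpl);
    }
    return *this;
}

void FileCompressor::Compress(const StringVector& vecFileNames,
                              const std::string& outputFileName) {
    Pimpl()->Compress(vecFileNames, outputFileName);
}

std::size_t FileCompressor::FilesSeen() const { return Pimpl()->FilesSeen(); }
```

Note that a moved-from FileCompressor holds a null pointer in this sketch, so it can only be destroyed or assigned to; the asserts document that assumption.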

Now, every method in the main class needs to forward the call to the private implementation, like:

void FileCompressor::Compress(const StringVector& vecFileNames,
                              const string& outputFileName)
{
    Pimpl()->Compress(vecFileNames, outputFileName);
}

The ‘real’ Compress() method decides which Compression method should be used (by the extension of the output file name) and then creates the method and forwards parameters.
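The extension-based decision could look roughly like the sketch below. The Method enum, ChooseMethod and this particular GetExtension are hypothetical helpers written for illustration; the real project creates ICompressionMethod objects instead of returning an enum.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper: the text after the last '.' of the file name part,
// or an empty string when there is no extension.
std::string GetExtension(const std::string& fileName) {
    const auto slash = fileName.find_last_of("/\\");
    const auto nameStart = (slash == std::string::npos) ? 0 : slash + 1;
    const auto dot = fileName.rfind('.');
    if (dot == std::string::npos || dot < nameStart)
        return {};  // no dot at all, or the dot belongs to a directory name
    return fileName.substr(dot + 1);
}

enum class Method { Zip, BZ2 };

// Maps the output file name to a compression method; throws for
// extensions this sketch does not know.
Method ChooseMethod(const std::string& outputFileName) {
    assert(!outputFileName.empty() && "the command line parser should reject this");
    const std::string ext = GetExtension(outputFileName);
    if (ext == "zip") return Method::Zip;
    if (ext == "bz2") return Method::BZ2;
    throw std::invalid_argument("Unsupported output extension: '" + ext + "'");
}
```

Throwing for an unknown extension fits the RunCompressor() shown earlier, which already catches std::exception and prints the message.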

Ok… but what’s the deal with having to implement all of that additional code, plus some boilerplate, plus that pointer management and proxy methods… ?

How did pimpl break dependencies?

The reason: Breaking dependencies.

After the core structure is working we can change the private implementation as much as we like and the client code (that includes FileCompressor.h) doesn’t have to be recompiled.

In this project, I’ve used precompiled headers, and what’s more the project is small. But it might play a role when you have many dependencies.

Another essential property of pimpl is ABI compatibility; it’s not important in the case of this example, however. I’ll return to this topic in a future blog post.

Still, what if the whole compression code, with the interface, sits in a different binary, a separate DLL? In that case, even if you change the private implementation, the ABI doesn’t change, so you can safely distribute a new version of the library.

Implementing more requirements

Ok… so something should work now, but we have two more elements to implement:

  • showing stats
  • showing all available compression methods

How to do it in the pimpl version?

In case of showing stats:

Stats are already supported by compression methods, so we just need to return them.

So we declare a new method in the public interface:

class FileCompressor
{
    ...
    void ShowStatsAfterCompression(ostream& os) const;
};

This will only be a proxy method:

void FileCompressor::ShowStatsAfterCompression(ostream& os) const
{
    Pimpl()->ShowStatsAfterCompression(os);
}

(Here’s the place where the Pimpl accessor kicks in: it won’t allow us to skip const when the private method inside CompressorImpl is declared.)

And… at last, the actual implementation:

void FileCompressor::CompressorImpl
::ShowStatsAfterCompression(ostream& os) const
{
    os << "Stats:\n";
    os << "Bytes Read: " << m_stats.m_bytesProcessed << "\n";
    os << "Bytes Saved: " << m_stats.m_BytesSaved << "\n";
}

So much code… just for writing a simple new method.

Ok… by now I hope you get the intuition of how pimpl works in our example. I’ve prepared another version that uses an abstract interface. Maybe it’s cleaner and easier to use than pimpl?

Abstract Interface version

If you read the section about compression methods - where ICompressionMethod is introduced - you might get an idea of how to apply such an approach to FileCompressor.

Keep in mind that we want to break the physical dependency between the client code and the implementation. That’s why we can declare an abstract interface and then provide some way to create the actual implementation (a factory?). The implementation will live only in a cpp file, so the client code won’t depend on it.

class IFileCompressor
{
public:
    virtual ~IFileCompressor() = default;

    virtual void Compress(const StringVector& vecFileNames,
                          const string& outputFileName) = 0;
    virtual void ShowStatsAfterCompression(ostream& os) const = 0;

    static unique_ptr<IFileCompressor> CreateImpl();
};

And then inside cpp file we can create the final class:

class FileCompressor : public IFileCompressor
{
public:
    void Compress(const StringVector& vecFileNames,
                  const string& outputFileName) override;
    void ShowStatsAfterCompression(ostream& os) const override;

private:
    DataStats m_stats;
};

And the factory method:

unique_ptr<IFileCompressor> IFileCompressor::CreateImpl()
{
    return unique_ptr<IFileCompressor>(new FileCompressor());
}

Can that work?
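A quick way to convince yourself is a single-file sketch like the one below. It is a simplified model, not the project code: the implementation only counts file names, and FilesProcessed() and CompressTwoFiles() are hypothetical additions so that the client side has something to observe.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

using StringVector = std::vector<std::string>;

// What would live in IFileCompressor.h: all that clients ever see.
class IFileCompressor {
public:
    virtual ~IFileCompressor() = default;

    virtual void Compress(const StringVector& vecFileNames,
                          const std::string& outputFileName) = 0;
    virtual std::size_t FilesProcessed() const = 0;  // hypothetical

    static std::unique_ptr<IFileCompressor> CreateImpl();
};

// What would live in FileCompressor.cpp: invisible to clients.
namespace {
class FileCompressor : public IFileCompressor {
public:
    void Compress(const StringVector& vecFileNames,
                  const std::string& /*outputFileName*/) override {
        m_filesProcessed += vecFileNames.size();  // stand-in for real work
    }
    std::size_t FilesProcessed() const override { return m_filesProcessed; }

private:
    std::size_t m_filesProcessed = 0;
};
}  // namespace

std::unique_ptr<IFileCompressor> IFileCompressor::CreateImpl() {
    return std::make_unique<FileCompressor>();
}

// Client code: it names only the interface and the factory.
std::size_t CompressTwoFiles() {
    auto compressor = IFileCompressor::CreateImpl();
    assert(compressor != nullptr);
    compressor->Compress({"a.txt", "b.txt"}, "out.zip");
    return compressor->FilesProcessed();
}
```

The client function never mentions the concrete class, so the members of that class can change without touching any file that includes the interface header.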

How did the abstract interface break dependencies?

With the abstract interface approach, we got into a situation where the exact implementation is declared and defined in a separate cpp file. So if we change it, there’s no need to recompile the client code. The same as we get with pimpl.

Was it easier than pimpl?

Yes!

No need for special classes, pointer management, proxy methods. When I implemented this, it was much cleaner.

Why might it be worse?

ABI compatibility.

If you want to add a new method to the public interface, it must be a virtual one. In pimpl, it can be a normal non-virtual method. The problem is that when you use a polymorphic type, you also get a hidden dependency on its vtable.

Now, if you add a new virtual method, the vtable might be completely different, so you cannot be sure that it will work in the client’s code.

Also, ABI compatibility requires Size and Layout of the class to be unchanged. So if you add a private member, that will change the size.
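The size argument can be checked at compile time with a tiny sketch. All type names below are hypothetical "version 1" and "version 2" snapshots of the same class. The second check assumes that std::unique_ptr with the default deleter has the same size for every pointee type, which holds on the mainstream standard libraries but is not something the standard promises.

```cpp
#include <memory>

// Two releases of a class that keeps its data directly in the object.
struct PlainV1 { int a; };
struct PlainV2 { int a; double b; };  // a new member was added

// Two releases of a pimpl-style class: only the hidden part grows.
struct ImplV1 { int a; };
struct ImplV2 { int a; double b; };
struct PimplV1 { std::unique_ptr<ImplV1> impl; };
struct PimplV2 { std::unique_ptr<ImplV2> impl; };

constexpr bool kPlainSizeChanged = sizeof(PlainV1) != sizeof(PlainV2);
constexpr bool kPimplSizeStable = sizeof(PimplV1) == sizeof(PimplV2);

// Clients compiled against PlainV1 would reserve the wrong amount of space.
static_assert(kPlainSizeChanged, "adding a member changes the object size");
// The pimpl object stays one pointer wide no matter what Impl contains.
static_assert(kPimplSizeStable, "the pimpl footprint does not change");
```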

Comparison

Let’s roughly compare what we’ve achieved so far with pimpl and abstract interface.

| Feature | pimpl | Abstract Interface |
|---|---|---|
| Compilation firewall | Yes | Yes |
| ABI compatibility | Yes | No |
| How to add a new method | Add new method in the main class; implement proxy method; implement the actual implementation | Add new virtual method into the interface; implement the override method in the implementation class |
| How to add a new private member? | Inside pimpl class; doesn’t affect ABI | Inside the interface implementation; changes size of the object, so is not binary compatible |
| Others | Not quite clean; harder to debug | Usually clean; cannot be used as a value on the stack |

Summary

This was a fun project.

We went from a straightforward implementation to a version where we managed to limit compilation dependencies. Two methods were tested: pimpl and abstract interface.

Personally, I prefer the abstract interface version. It’s much easier to maintain (as it’s only one class + interface), rather than a class that serves as a proxy plus the real private implementation.

What’s your choice?

Moreover, I enjoyed working with Conan as a package manager. It significantly improved the development speed! If I wanted to test a new library (a new compression method), I just had to find the proper link and update conanfile.txt. I hope to have more occasions to use this system. Maybe even as a producer of a package.

And here I’d like to thank JFrog-Conan for sponsoring and helping in writing this blog post.

But that’s not the end!

Next time I hope to improve the code and return with an example of a separate DLL, to see what that ABI compatibility is… and how it works.

How to propagate const on a member pointer?

propagate_const, C++

Inside const methods all member pointers become constant pointers.
However sometimes it would be more practical to have constant pointers to constant objects.

So how can we propagate such constness?

The problem

Let’s discuss a simple class that keeps a pointer to another class. This member field might be an observing (raw) pointer, or some smart pointer.

class Object
{
public:
    void Foo() { }
    void FooConst() const { }
};

class Test
{
private:
    unique_ptr<Object> m_pObj;

public:
    Test() : m_pObj(make_unique<Object>()) { }

    void Foo() {
        m_pObj->Foo();
        m_pObj->FooConst();
    }

    void FooConst() const {
        m_pObj->Foo();
        m_pObj->FooConst();
    }
};

We have two methods, Test::Foo and Test::FooConst, that call all methods (const and non-const) of our m_pObj pointer.

Can this compile?

Of course!

So what’s the problem here?

Have a look:

Test::FooConst is a const method, so you cannot modify members of the object. In other words they become const. You can also see it as this pointer inside such method becomes const Test *.

In the case of m_pObj it means you cannot change its value (change its address), but there’s nothing wrong with changing the value that it’s pointing to. It also means that if such an object is a class, you can safely call its non-const methods.

Just for the reference:

// value being pointed cannot be changed:
const int* pInt;
int const* pInt; // equivalent form
// address of the pointer cannot be changed,
// but the value being pointed can be
int* const pInt;
// both value and the address of the
// pointer cannot be changed
const int* const pInt;
int const* const pInt; // equivalent form

m_pObj becomes Object* const but it would be far more useful to have Object const* const.

In short: we’d like to propagate const on member pointers.
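The shallow behaviour can be verified without running anything. The sketch below uses only standard type traits; Object and Holder are placeholder types for the example. It asks the compiler what a const member function actually gets from a const smart pointer and from a const raw pointer member.

```cpp
#include <memory>
#include <type_traits>
#include <utility>

struct Object {};

// Inside a const member function a unique_ptr member is seen like this:
using ConstMember = const std::unique_ptr<Object>&;

// ...yet it still hands out a pointer/reference to a mutable Object.
static_assert(std::is_same<decltype(std::declval<ConstMember>().get()),
                           Object*>::value,
              "get() on a const unique_ptr returns Object*");
static_assert(std::is_same<decltype(*std::declval<ConstMember>()),
                           Object&>::value,
              "operator* on a const unique_ptr returns Object&");

// The same holds for a raw pointer member: the pointer becomes const,
// the pointee does not.
struct Holder { Object* p; };
static_assert(std::is_same<decltype((std::declval<const Holder&>().p)),
                           Object* const&>::value,
              "member of a const Holder is Object* const, not const Object*");
```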

Small examples

Are there any practical examples?

One example might be with Controls:

If a Control class contains an EditBox (via a pointer) and you call:

int Control::ReadValue() const
{
    return pEditBox->GetValue();
}

auto val = myControl.ReadValue();

It would be great if inside Control::ReadValue (which is const) you could only call const methods of your member controls (stored as pointers).

And another example: the pimpl pattern.

Pimpl divides class and moves private section to a separate class. Without const propagation that private impl can safely call non-const methods from const methods of the main class. So such design might be fragile and become a problem at some point. Read more in my recent posts: here and here.

What’s more there’s also a notion that a const method should be thread safe. But since you can safely call non const methods of your member pointers that thread-safety might be tricky to guarantee.

Ok, so how to achieve such const propagation through layers of method calls?

Wrappers

One of the easiest methods is to have some wrapper around the pointer.

I found this technique while I was researching pimpl (have a look here: The Pimpl Pattern - what you should know).

You can write a wrapper method:

const Object* PObject() const { return m_pObj.get(); }
Object* PObject() { return m_pObj.get(); }

And in every place - especially in const method(s) of the Test class - you have to use PObject accessor. That works, but might require consistency and discipline.
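Here is a small, self-contained sketch of the accessor approach applied to the Test class from the beginning of the post. The Calls() counter is a hypothetical addition that makes the effect observable; the static_assert shows which overload a const method picks.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

class Object {
public:
    void Foo() { ++m_calls; }
    int Calls() const { return m_calls; }

private:
    int m_calls = 0;
};

class Test {
public:
    Test() : m_pObj(std::make_unique<Object>()) {}

    void Foo() { PObject()->Foo(); }  // non-const path: Object*

    int Calls() const {
        // In a const method the const overload is selected...
        static_assert(std::is_same<decltype(PObject()), const Object*>::value,
                      "const method sees a pointer to const");
        // PObject()->Foo();  // ...so this line would not compile.
        return PObject()->Calls();
    }

private:
    const Object* PObject() const { assert(m_pObj); return m_pObj.get(); }
    Object* PObject() { assert(m_pObj); return m_pObj.get(); }

    std::unique_ptr<Object> m_pObj;
};
```

Nothing stops a method from touching m_pObj directly, which is exactly the discipline problem mentioned above.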

Another way is to use some wrapper type. One of such helpers is suggested in the article Pimp My Pimpl — Reloaded | -Wmarc.

In the StackOverflow question: Propagate constness to data pointed by member variables I’ve also found that the Loki library has something like: Loki::ConstPropPtr.
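To show the idea behind such wrapper types, here is a deliberately minimal sketch. DeepConstPtr is a hypothetical name; it is neither the helper from the linked article nor Loki's class, just the smallest thing that overloads the access operators on constness.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical minimal wrapper: const access yields a pointer to const.
template <typename T>
class DeepConstPtr {
public:
    explicit DeepConstPtr(std::unique_ptr<T> p) : m_ptr(std::move(p)) {}

    T* operator->() { return m_ptr.get(); }
    const T* operator->() const { return m_ptr.get(); }

    T& operator*() { assert(m_ptr); return *m_ptr; }
    const T& operator*() const { assert(m_ptr); return *m_ptr; }

private:
    std::unique_ptr<T> m_ptr;
};

class Object {
public:
    void Foo() { ++m_calls; }
    int Calls() const { return m_calls; }

private:
    int m_calls = 0;
};

class Test {
public:
    Test() : m_pObj(std::make_unique<Object>()) {}

    void Foo() { m_pObj->Foo(); }

    int Calls() const {
        // m_pObj->Foo();  // would not compile: const Object* here
        return m_pObj->Calls();
    }

private:
    DeepConstPtr<Object> m_pObj;
};

static_assert(
    std::is_same<decltype(std::declval<const DeepConstPtr<Object>&>().operator->()),
                 const Object*>::value,
    "const access propagates to the pointee");
```

Compared with accessor methods, the protection now lives in the member's type, so there is no raw member left to misuse by accident.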

propagate_const

propagate_const is currently part of the Library Fundamentals TS v2:
C++ standard libraries extensions, version 2.

And is the wrapper that we need:

From propagate_const @cppreference.com:

std::experimental::propagate_const is a const-propagating wrapper for pointers and pointer-like objects. It treats the wrapped pointer as a pointer to const when accessed through a const access path, hence the name.

As far as I understand, this TS is already published, so it might eventually be merged into the standard, maybe in C++20.

It’s already available in

Here’s the paper:

N4388 - A Proposal to Add a Const-Propagating Wrapper to the Standard Library

The authors even suggest changing the meaning of the keyword const… or a new keyword :)

Given absolute freedom we would propose changing the const keyword to propagate const-ness.

But of course

That would be impractical, however, as it would break existing code and change behaviour in potentially undesirable ways

So that’s why we have a separate wrapper :)

We can rewrite the example like this:

#include <experimental/propagate_const>

class Object
{
public:
    void Foo() { }
    void FooConst() const { }
};

namespace stdexp = std::experimental;

class Test
{
private:
    stdexp::propagate_const<std::unique_ptr<Object>> m_pObj;

public:
    Test() : m_pObj(std::make_unique<Object>()) { }

    void Foo() {
        m_pObj->Foo();
        m_pObj->FooConst();
    }

    void FooConst() const {
        //m_pObj->Foo(); // cannot call now!
        m_pObj->FooConst();
    }
};

propagate_const is move constructible and move assignable, but not copy constructible or copy assignable.
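In practice this means that a class holding a propagate_const member has to write its own copy operations if it wants to be copyable. The sketch below assumes a standard library that ships the <experimental/propagate_const> header; the Set/Get methods and the Clone helper are hypothetical names used only for the demonstration.

```cpp
#include <cassert>
#include <experimental/propagate_const>
#include <memory>
#include <type_traits>

namespace stdexp = std::experimental;

class Object {
public:
    int value = 0;
};

using ObjectPtr = stdexp::propagate_const<std::unique_ptr<Object>>;

static_assert(std::is_move_constructible<ObjectPtr>::value, "movable");
static_assert(!std::is_copy_constructible<ObjectPtr>::value, "not copyable");

class Test {
public:
    Test() : m_pObj(std::make_unique<Object>()) {}

    // Copying has to clone the pointee explicitly.
    Test(const Test& rhs) : m_pObj(Clone(rhs.m_pObj)) {}
    Test& operator=(const Test& rhs) {
        if (this != &rhs)
            m_pObj = Clone(rhs.m_pObj);
        return *this;
    }

    // Moving can simply be defaulted.
    Test(Test&&) = default;
    Test& operator=(Test&&) = default;

    void Set(int v) { m_pObj->value = v; }
    int Get() const { return m_pObj->value; }  // read-only access compiles

private:
    static std::unique_ptr<Object> Clone(const ObjectPtr& p) {
        assert(p && "cannot copy from a moved-from Test");
        return std::make_unique<Object>(*p);  // *p is const Object& here
    }

    ObjectPtr m_pObj;
};
```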

Playground

As usual you can play with the code using a live sample:

Summary

Special thanks to iloveportalz0r, who commented on my previous article about pimpl and suggested using propagate_const! I hadn’t seen this wrapper type before, so it’s always great to learn something new and useful.

All in all, I think it’s worth knowing about the shallow const problem. So if you care about const correctness in your system (and you should!) then propagate_const (or any other wrapper or technique) is a very important tool in your pocket.

Factory With Self-Registering Types

Factory with self registering types

Writing a factory method might be simple:

unique_ptr<IType> create(const string& name) {
    if (name == "Abc") return make_unique<AbcType>();
    if (name == "Xyz") return make_unique<XyzType>();
    if (...) return ...

    return nullptr;
}

Just one switch/if and then after a match you return a proper type.

But what if we don’t know all the types and names upfront? Or when we’d like to make such factory more generic?

Let’s see how classes can register themselves in a factory and look at some examples where it’s used.

Intro

The code shown as the example at the beginning of this text is not wrong when you have a relatively simple application. For example, in my experiments with pimpl, my first version of the code contained:

static unique_ptr<ICompressionMethod>
Create(const string& fileName)
{
    auto extension = GetExtension(fileName);
    if (extension == "zip")
        return make_unique<ZipCompression>();
    else if (extension == "bz")
        return make_unique<BZCompression>();

    return nullptr;
}

In the above code, I wanted to create ZipCompression or BZCompression based on the extensions of the filename.

That straightforward solution worked for me for a while. Still, if you want to go further with the evolution of the application you might struggle with the following issues:

  • Each time you write a new class, and you want to include it in the factory you have to add another if in the Create() method. Easy to forget in a complex system.
  • All the types must be known to the factory
  • In Create() we arbitrarily used strings to represent types. Such representation is only visible in that single method. What if you’d like to use it somewhere else? Strings might be easily misspelt, especially if you have several places where they are compared.

So all in all, we get strong dependency between the factory and the classes.

But what if classes could register themselves? Would that help?

  • The factory would just do its job: create new objects based on some matching.
  • If you write a new class there’s no need to change parts of the factory class. Such class would register automatically.

It sounds like an excellent idea.

A practical example

To give you more motivation I’d like to show one real-life example:

Google Test

When you use Google Test library, and you write:

TEST(MyModule,InitTest)
{
}

Behind this single TEST macro a lot of things happen!

For starters your test is expanded into a separate class - so each test is a new class.

But then, there’s a problem: you have all the tests, so how does the test runner know about them?

It’s the same problem we’re trying to solve in this post. The classes need to be registered.

Have a look at this code: from googletest/…/gtest-internal.h:

// (some parts of the code cut out)
#define GTEST_TEST_(test_case_name, test_name, parent_class, parent_id)\
class GTEST_TEST_CLASS_NAME_(test_case_name, test_name) \
    : public parent_class \
{\
    virtual void TestBody();\
    static ::testing::TestInfo* const test_info_ GTEST_ATTRIBUTE_UNUSED_;\
};\
\
::testing::TestInfo* const GTEST_TEST_CLASS_NAME_(test_case_name, test_name)\
    ::test_info_ =\
        ::testing::internal::MakeAndRegisterTestInfo(\
            #test_case_name, #test_name, NULL, NULL, \
            new ::testing::internal::TestFactoryImpl<\
                GTEST_TEST_CLASS_NAME_(test_case_name, test_name)>);\
void GTEST_TEST_CLASS_NAME_(test_case_name, test_name)::TestBody()

I cut some parts of the code to make it shorter, but basically GTEST_TEST_ is used in TEST macro and this will expand to a new class. In the lower section, you might see a name MakeAndRegisterTestInfo. So here’s the place where the class registers!

After the registration, the runner knows all the existing tests and can invoke them.
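To make the mechanism more tangible, here’s a minimal sketch of the same idea. It is not Google Test’s implementation: MINI_TEST, RegisterTest, Registry and RunAllTests are hypothetical names made up for this illustration. The macro declares a test function and registers it through the initializer of a static bool.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical mini-framework for illustration, NOT Google Test's real code.
struct TestEntry {
    std::string name;
    void (*body)();
};

// Function-local static: the registry exists before the first registration.
inline std::vector<TestEntry>& Registry() {
    static std::vector<TestEntry> tests;
    return tests;
}

inline bool RegisterTest(const char* name, void (*body)()) {
    Registry().push_back({name, body});
    return true;
}

// Declares the test function, registers it via a static bool,
// and then opens the definition of the test body.
#define MINI_TEST(name)                                               \
    static void name();                                               \
    static const bool name##_registered = RegisterTest(#name, &name); \
    static void name()

static int g_runCount = 0;

MINI_TEST(InitTest) { ++g_runCount; }
MINI_TEST(ShutdownTest) { ++g_runCount; }

// The "runner" knows only the registry, not the individual tests.
inline int RunAllTests() {
    for (const auto& t : Registry()) t.body();
    return static_cast<int>(Registry().size());
}
```

Calling RunAllTests() from main() executes both registered bodies, although the runner never mentions InitTest or ShutdownTest by name.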

When I was implementing a custom testing framework for one of my projects I went for a similar approach. After my test classes were registered, I could filter them, show their info and of course execute the test suites.

I believe other testing frameworks might use a similar technique.

Flexibility

My previous example was related to unknown types: for tests, you know them at compile time, but it would be hard to list them in one method create.

Still, such self-registration is useful for flexibility and scalability. Even for my two classes: BZCompression and ZipCompression.

Now when I’d like to add a third compression method, I just have to write a new class, and the factory will know about it - without much intervention in the factory code.

Ok, ok… we’ve discussed some examples, but you probably want to see the details!

So let’s move to the actual implementation.

Self-registration

What do we need?

  • Some Interface - we’d like to create classes that are derived from one interface. It’s the same requirement as a “normal” factory method.
  • Factory class that also holds a map of available types
  • A proxy that will be used to create a given class. The factory doesn’t know how to create a given type now, so we have to provide some proxy class to do it.

For the interface we can use ICompressionMethod:

class ICompressionMethod
{
public:
    ICompressionMethod() = default;
    virtual ~ICompressionMethod() = default;

    virtual void Compress() = 0;
};

And then the factory:

class CompressionMethodFactory
{
public:
    using TCreateMethod = unique_ptr<ICompressionMethod>(*)();

public:
    CompressionMethodFactory() = delete;

    static bool Register(const string name, TCreateMethod funcCreate);

    static unique_ptr<ICompressionMethod> Create(const string& name);

private:
    static map<string, TCreateMethod> s_methods;
};

The factory holds the map of registered types. The main point here is that the factory uses now some method (TCreateMethod) to create the desired type (this is our proxy). The name of a type and that creation method must be initialized in a different place.

The implementation of such factory:

map<string, CompressionMethodFactory::TCreateMethod>
    CompressionMethodFactory::s_methods;

bool CompressionMethodFactory::Register(const string name,
                                        TCreateMethod funcCreate)
{
    if (auto it = s_methods.find(name); it == s_methods.end())
    { // C++17 init-if ^^
        s_methods[name] = funcCreate;
        return true;
    }
    return false;
}

unique_ptr<ICompressionMethod>
CompressionMethodFactory::Create(const string& name)
{
    if (auto it = s_methods.find(name); it != s_methods.end())
        return it->second(); // call the createFunc

    return nullptr;
}

Now we can implement a derived class from ICompressionMethod that will register in the factory:

class ZipCompression : public ICompressionMethod
{
public:
    virtual void Compress() override;

    static unique_ptr<ICompressionMethod> CreateMethod() {
        return make_unique<ZipCompression>();
    }
    static std::string GetFactoryName() { return "ZIP"; }

private:
    static bool s_registered;
};

The downside of self-registration is that there’s a bit more work for a class. As you can see we have to have a static CreateMethod defined.

To register such class all we have to do is to define s_registered:

bool ZipCompression::s_registered =
    CompressionMethodFactory::Register(ZipCompression::GetFactoryName(),
                                       ZipCompression::CreateMethod);

The basic idea for this mechanism is that we rely on static variables. They will be initialized before main() is called.
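Putting the pieces together, here’s a compact sketch of the whole mechanism in a single translation unit. It reuses the names from this post, but a few details are assumptions made for the demo: the Name() member is hypothetical (it stands in for Compress() so there’s something to check), and Register() uses emplace instead of the find-then-insert shown above. Because everything lives in one file, the map is defined before s_registered and is therefore constructed first.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

class ICompressionMethod {
public:
    virtual ~ICompressionMethod() = default;
    virtual std::string Name() const = 0;  // hypothetical member, demo only
};

class CompressionMethodFactory {
public:
    using TCreateMethod = std::unique_ptr<ICompressionMethod> (*)();

    static bool Register(const std::string& name, TCreateMethod create) {
        // emplace leaves the map untouched if the name is already present
        return s_methods.emplace(name, create).second;
    }
    static std::unique_ptr<ICompressionMethod> Create(const std::string& name) {
        auto it = s_methods.find(name);
        if (it != s_methods.end())
            return it->second();  // call the registered creation function
        return nullptr;
    }

private:
    static std::map<std::string, TCreateMethod> s_methods;
};

// Defined before any s_registered in this file, so it is constructed first.
std::map<std::string, CompressionMethodFactory::TCreateMethod>
    CompressionMethodFactory::s_methods;

class ZipCompression : public ICompressionMethod {
public:
    std::string Name() const override { return "zip"; }
    static std::unique_ptr<ICompressionMethod> CreateMethod() {
        return std::make_unique<ZipCompression>();
    }

private:
    static bool s_registered;
};

// The initializer runs during startup and performs the registration.
bool ZipCompression::s_registered =
    CompressionMethodFactory::Register("ZIP", ZipCompression::CreateMethod);
```

With this in place, CompressionMethodFactory::Create("ZIP") returns a ZipCompression object, while an unknown name such as "RAR" yields nullptr.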

But can we be sure that all of the code is executed, and all the classes are registered? s_registered is not used anywhere later, so maybe it could be optimized and removed? And what about the order of initialization?

Static var initialization

We might run into two problems:

Order of static variables initialization:

It’s called “static initialization order fiasco” - it’s a problem where one static variable depends on another static variable. Like static int a = b + 1 (where b is also static). You cannot be sure b will be initialized before a. Bear in mind that such variables might be in a different compilation unit.

Fortunately, for us, it doesn’t matter. We might end up with a different order of elements in the factory container, but each name/type is not dependent on other already registered types.

But what about the first insertion? Can we be sure that the map is created and ready for use?

To be certain I’ve even asked a question at SO:
C++ static initialization order: adding into a map - Stack Overflow

Our map is defined as follows:

map<string, CompressionMethodFactory::TCreateMethod> CompressionMethodFactory::s_methods;

And that falls into the category of Zero initialization. Later, the dynamic initialization happens - in our case, it means all s_registered variables are initialized.

So it seems we’re safe here.

You can read more about it at isocpp FAQ and at cppreference - Initialization.

Can s_registered be eliminated?

Fortunately, we’re also on the safe side:

From the latest draft of C++: n4713.pdf [basic.stc.static], point 2:

If a variable with static storage duration has initialization or a destructor with side effects, it shall not be eliminated even if it appears to be unused.

So the compiler won’t optimize such variable.

Although this might happen when we use some templated version… but more on that later.

Extensions

All in all, it seems that our code should work! :)

For now, I’ve only shown a basic version, and we can think about some updates:

Proxy classes

In our example, I’ve used only a map that holds <name, TCreateMethod> - this works because all we need is a way to create the object.

We can extend this and use a “full” proxy class that will serve as “meta” object for the target type.

In my final app code I have the following type:

struct CompressionMethodInfo
{
    using TCreateMethod = std::unique_ptr<ICompressionMethod>(*)();
    TCreateMethod m_CreateFunc;
    string m_Description;
};

Besides the creation function, I’ve added m_Description. This addition makes it possible to store a useful description of the compression method. I can then show all that information to the user without the need to create real compression methods.

The factory class now uses:

static map<string, CompressionMethodInfo> s_methods;

And when registering the class, I need to pass the info object, not just the creation method.

bool ZipCompression::s_registered =
    CompressionMethodFactory::Register(
        ZipCompression::GetFactoryName(),
        { ZipCompression::CreateMethod,
          "Zip compression using deflate approach"
        });
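Here’s a small sketch of how such an info object could be used to list the available methods without creating any of them. It’s only an illustration under a few assumptions: the free functions Methods(), Register(), DescribeAll() and CreateZip() are hypothetical stand-ins for the factory members, and the map is kept in a function-local static so the sketch doesn’t depend on initialization order.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct ICompressionMethod {
    virtual ~ICompressionMethod() = default;
};
struct ZipCompression : ICompressionMethod {};

struct CompressionMethodInfo {
    using TCreateMethod = std::unique_ptr<ICompressionMethod> (*)();
    TCreateMethod m_CreateFunc;
    std::string m_Description;
};

// Hypothetical stand-in for the factory's container.
std::map<std::string, CompressionMethodInfo>& Methods() {
    static std::map<std::string, CompressionMethodInfo> s_methods;
    return s_methods;
}

bool Register(const std::string& name, CompressionMethodInfo info) {
    return Methods().emplace(name, std::move(info)).second;
}

// Builds "name: description" lines; no compression object is created here.
std::vector<std::string> DescribeAll() {
    std::vector<std::string> lines;
    for (const auto& [name, info] : Methods())
        lines.push_back(name + ": " + info.m_Description);
    return lines;
}

static std::unique_ptr<ICompressionMethod> CreateZip() {
    return std::make_unique<ZipCompression>();
}

static const bool s_zipRegistered =
    Register("ZIP", {CreateZip, "Zip compression using deflate approach"});
```

DescribeAll() only reads the metadata stored next to the creation function, so it’s cheap to call even when the real methods are expensive to construct.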

Templates

As I mentioned, the downside of self-registration is that each class needs some additional code. Maybe we can pack it in some RegisterHelper<T> template?

Here’s some code (with just creation method, not with the full info proxy class):

template <typename T>
class RegisteredInFactory
{
protected:
    static bool s_bRegistered;
};

template <typename T>
bool RegisteredInFactory<T>::s_bRegistered =
    CompressionMethodFactory::Register(T::GetFactoryName(), T::CreateMethod);

The helper template class wraps the s_bRegistered static variable and registers the type in the factory. So now, a class you want to register just has to provide T::GetFactoryName and T::CreateMethod:

class ZipCompression : public ICompressionMethod,
                       public RegisteredInFactory<ZipCompression>
{
public:
    virtual void Compress() override { /*s_bRegistered;*/ }

    static unique_ptr<ICompressionMethod> CreateMethod() { ... }
    static std::string GetFactoryName() { return "ZIP"; }
};

Looks good… right?

But when you run it the class is not being registered!

Have a look at this code @coliru.

But if you uncomment /*s_bRegistered*/ from void Compress() the registration works fine.

Why is that?

It seems that although s_bRegistered is also a static variable, it’s inside a template. And template members are instantiated only when they are used (see odr-use @stackoverflow). If the variable is not used anywhere, its definition is never instantiated, so there is nothing left to initialize…

Another topic that’s worth a separate discussion.

So all in all, we have to be smarter with the templated helper. I’ll have to leave it for now.
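One possible direction, shown here only as a sketch: the observation above (touching s_bRegistered makes the registration work) can be moved into the helper itself, for example into its constructor. The free functions Methods() and Register() are hypothetical stand-ins for the factory, and the map sits in a function-local static because static members of templates are initialized in no particular order relative to other globals. I haven’t verified this on every compiler, so treat it as a starting point.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

struct ICompressionMethod {
    virtual ~ICompressionMethod() = default;
};

using TCreateMethod = std::unique_ptr<ICompressionMethod> (*)();

// Hypothetical stand-in for the factory's container (constructed on first use).
std::map<std::string, TCreateMethod>& Methods() {
    static std::map<std::string, TCreateMethod> s_methods;
    return s_methods;
}

bool Register(const std::string& name, TCreateMethod create) {
    return Methods().emplace(name, create).second;
}

template <typename T>
class RegisteredInFactory {
protected:
    static bool s_bRegistered;
    // Touching s_bRegistered odr-uses it, so its definition gets instantiated
    // as soon as a constructor of a derived class is generated.
    RegisteredInFactory() { (void)s_bRegistered; }
};

template <typename T>
bool RegisteredInFactory<T>::s_bRegistered =
    Register(T::GetFactoryName(), T::CreateMethod);

class ZipCompression : public ICompressionMethod,
                       public RegisteredInFactory<ZipCompression> {
public:
    // make_unique needs the ZipCompression constructor, which in turn
    // needs the helper's constructor above.
    static std::unique_ptr<ICompressionMethod> CreateMethod() {
        return std::make_unique<ZipCompression>();
    }
    static std::string GetFactoryName() { return "ZIP"; }
};
```

The derived class no longer has to mention s_bRegistered at all; the price is that the helper relies on a somewhat subtle instantiation rule.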

Not using strings as names

I am not happy that we’re still using strings to match the classes.

Still, if used with care strings will work great. Maybe they won’t be super fast to match, but it depends on your performance needs. Ideally, we could think about unique ids like ints, hashes or GUIDs.

Some articles to read and extend

Summary

In this post, I’ve covered a type of factory where types register themselves. It’s the opposite of simple factories, where all the types are declared upfront.

Such an approach gives more flexibility and removes the dependency on the exact list of supported classes from the factory.

The downside is that the classes that want to register need to ask for it and thus they need a bit more code.

Let me know what you think about self-registration. Do you use it in your projects? Or maybe you have some better ways?

Static Variables Initialization in a Static Library, Example


C++ static variables initialization and linker, one example

This post is motivated by one important comment from my last article about factories and self-registering types:

(me) So the compiler won’t optimize such variable.

Yet, unfortunately, the linker will happily ignore it if linking from a static library.

So… what’s the problem with the linker?

Intro

The main idea behind self-registering types is that each class needs to register in the factory. The factory doesn’t know all the types upfront.

In my proposed solution you have to invoke the following code:

bool ZipCompression::s_registered =
    CompressionMethodFactory::Register(ZipCompression::GetFactoryName(),
                                       ZipCompression::CreateMethod);

s_registered is a static boolean variable in the class. The variable is initialized before main() kicks in and later you have all the types in the factory.

In the above example, we rely on two things:

  1. The container that is used inside the factory is “prepared” and initialized - so we can add new items.
    • In other words, the container must be initialized before we register the first type.
  2. The initialization of s_registered is invoked, and the variable is not optimized away.

Additionally, we don’t rely on the order of initializations between types. So if we have two classes like “Foo” and “Bar”, the order in which they end up in the factory container doesn’t matter.

I mentioned that the two points are satisfied by the following fact from the Standard:

If a variable with static storage duration has initialization or a destructor with side effects, it shall not be eliminated even if it appears to be unused.

Moreover, for static variables, Zero initialization is performed before Dynamic initialization: so the map will be initialized first - during Zero initialization, and the s_registered variables are then initialized in the Dynamic part.

But what about linkers and using such an approach in static libraries?

It appears that there are no explicit rules and our classes might not be registered at all!

Example

Let’s consider the following application:

The client app:

#include <cassert>
#include "CompressionMethod.h"

int main()
{
    auto pMethod = CompressionMethodFactory::Create("ZIP");
    assert(pMethod);
    return 0;
}

The application just asks to create the ZIP method. The factory and all the methods are declared and defined in a separate static library:

// declares the basic interface for the methods 
// and the factory class:
CompressionMethod.h
CompressionMethod.cpp
// methods implementation:
Methods.h
Methods.cpp

Notice that in the client app we only include “CompressionMethod.h”.

The effect

In the Register() method I added simple logging, so we can see what class is being registered. You could also set a breakpoint there.

I have two compression method implementations: “Zip” and “Bz”.

When all of the files are compiled into one project:

Properly registered static variables

But when I run the above configuration with the static library I see a blank screen… and error:

static variables not initialized

The reason

So why is that happening?

The C++ standard isn’t explicit about the linking model of static libraries. And the linker usually tries to pull unresolved symbols from the library until everything is defined.

All of s_registered variables are not needed for the client application (the linker doesn’t include them in the “unresolved set” of symbols), so they will be skipped, and no registration happens.

This linker behaviour might be a problem when you have a lot of self-registered classes. Some of them register from the static library and some from the client app. It might be tricky to notice that some of the types are not available! So be sure to have tests for such cases.

Solutions

Brute force - code

Just call the register methods.

This is a bit of a contradiction - as we wanted to have self-registering types. But in that circumstance, it will just work.

Brute force - linker

Include all the symbols in the client app.

The negative of this approach is that it will bloat the final exe size.

For MSVC

  • /WHOLEARCHIVE:CompressionMethodsLib.lib - in the additional linker options.

For GCC

  • -whole-archive for LD

This option worked for me, but in the first place I got this:

Factory types, almost registered

While s_registered variables are initialized, it seems that the map is not. I haven’t investigated what’s going on there, but I realized that a simple fix might work:

To be sure that the container is ready for the first addition we can wrap it into a static method with a static variable:

map<string, CompressionMethodInfo>&
CompressionMethodFactory::GetMap()
{
    static map<string, CompressionMethodInfo> s_methods;
    return s_methods;
}

And every time you want to access the container you have to call GetMap(). This will make sure the container is ready before the first use.
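As a rough sketch of why this helps: a function-local static is constructed the first time control passes through its declaration, so the first Register() call creates the container on demand, no matter when that call happens during startup. The Factory class below is a simplified, hypothetical stand-in that maps names to plain int ids instead of CompressionMethodInfo objects.

```cpp
#include <cassert>
#include <map>
#include <string>

// Simplified, hypothetical stand-in for CompressionMethodFactory.
class Factory {
public:
    static bool Register(const std::string& name, int id) {
        return GetMap().emplace(name, id).second;
    }
    static int Find(const std::string& name) {
        auto it = GetMap().find(name);
        return it != GetMap().end() ? it->second : -1;  // -1 means "unknown"
    }

private:
    static std::map<std::string, int>& GetMap() {
        static std::map<std::string, int> s_methods;  // constructed on first call
        return s_methods;
    }
};

// Whichever registration runs first also triggers construction of the map.
static const bool s_zipRegistered = Factory::Register("ZIP", 1);
static const bool s_bzRegistered = Factory::Register("BZ", 2);
```

The only cost is the extra function call (and the hidden "already initialized" check) on every access to the container.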

“Use Library Dependency Inputs”, MSVC

Any more ideas?

Wrap up

Initialization of static variables is a tricky thing. Although we can be sure about the order of initialization across one application built from object files, it gets even trickier when you rely on symbols from a static library.

In this post, I’ve given a few ideas on how to solve the problem, but be sure to check what’s best in your situation.

Once again thanks for the feedback on r/cpp for my previous article.

The code for Visual Studio can be found here: fenbf/CompressFileUtil/factory_in_static_lib

What happens to your static variables at the start of the program?


Static Variables in C++

Saying that C++ has simple rules for variables initialization is probably quite risky :) For example, you can read Initialization in C++ is Bonkers : r/cpp to see a vibrant discussion about this topic.

But let’s try with just a small part of variables: static variables.
How are they initialized? What happens before main()(*) ?

Warning: implementation dependent, see explanations in the post.

Intro

Have a look at the following code where I use a global variable t (nice and descriptive name... right? :)) :

class Test
{
public:
    Test() { }
public:
    int _a;
};

Test t; // <<

int main()
{
    return t._a;
}

What is the value of t._a in main()?
Is the constructor of Test even called?

Let’s run the debugger!

Debugging

I’ll be using Visual Studio 2017 to run my apps. Although the initialization phase is implementation dependent, runtime systems share a lot of ideas in order to conform to the standard.

I created a breakpoint at the start of Test::Test() and this is the call stack I got:

test_static.exe!Test::Test() Line 12
test_static.exe!`dynamic initializer for '_t''() Line 20
ucrtbased.dll!_initterm(void(*)() * first, void(*)() * last) Line 22
test_static.exe!__scrt_common_main_seh() Line 251
test_static.exe!__scrt_common_main() Line 326
test_static.exe!mainCRTStartup() Line 17

Wow… the runtime invokes a few functions before the main() kicks in!

The debugger stopped in a place called dynamic initializer for '_t''(). What’s more, the member variable _a was already set to 0.

Let’s look at the steps:

Our global variable t is not constant initialized, because according to the standard (see constant initialization @cppreference) such an initialization should have the form:

static T& ref = constexpr;
static T object = constexpr;

So the following things happen:

For all other non-local static and thread-local variables, Zero initialization takes place.

And then:

After all static initialization is completed, dynamic initialization of non-local variables occurs…

In other words: the runtime initializes our variables to zero and then it invokes the dynamic part.
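A tiny sketch that makes the two phases observable (Probe, g_first, g_second and g_counter are names made up for this illustration): the counter is defined at the very end of the file, yet it is already zero when the constructors above it run, because zero initialization of all static variables happens before any dynamic initialization.

```cpp
#include <cassert>

extern int g_counter;  // declared here, defined at the bottom of the file

struct Probe {
    int seen;  // value of g_counter observed while this object is constructed
    Probe() : seen(g_counter) { ++g_counter; }
};

Probe g_first;   // dynamic initialization, in definition order within this file
Probe g_second;

int g_counter;   // no initializer: zero-initialized during static initialization
```

After startup g_first.seen is 0, g_second.seen is 1, and g_counter ends up as 2.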

Zero initialization

I’ve found this short and concise summary of Zero Initialization @MSDN:

  • Numeric variables are initialized to 0 (or 0.0, or 0.0000000000, etc.).
  • Char variables are initialized to ‘\0’.
  • Pointers are initialized to nullptr.
    • Arrays, POD classes, structs, and unions have their members initialized to a zero value.

Our object t is a class instance, so the compiler will initialize its members to zero.

What’s more, global variables might be put into the BSS segment of the program, which means that they don’t take any space on disk. The whole BSS segment is represented only by its length (the sum of the sizes of all global variables). The section is then cleared (something like memset(bssStart, 0, bssLen)).

For example, looking at the asm output from my code it looks like MSVC put t variable in _BSS:

_BSS    SEGMENT
?t@@3VTest@@A DD 01H DUP (?) ; t
_BSS ENDS

You can read more @cppreference - zero initialization

Dynamic initialization

From the standard 6.6.2 Static initialization “basic.start.static”, N4659, Draft

Together, zero-initialization and constant initialization are called static initialization; all other initialization is dynamic initialization.

In MSVC each dynamic initializer is loaded into arrays of functions:

// internal_shared.h
typedef void (__cdecl* _PVFV)(void);
// First C++ Initializer
extern _CRTALLOC(".CRT$XCA") _PVFV __xc_a[];
// Last C++ Initializer
extern _CRTALLOC(".CRT$XCZ") _PVFV __xc_z[];

And later, a method called _initterm invokes those functions:

_initterm(__xc_a, __xc_z);

_initterm just calls every function, assuming it’s not null:

extern "C" void __cdecl _initterm(_PVFV* const first,
                                  _PVFV* const last)
{
    for (_PVFV* it = first; it != last; ++it)
    {
        if (*it == nullptr)
            continue;

        (**it)();
    }
}

If any of the initializers throws an exception, std::terminate() is called.

Dynamic initializer for t will call its constructor. This is exactly what I’ve seen in the debugger.
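The same pattern can be imitated in portable code. The sketch below is not the CRT’s code: InitFn, g_table, InitA, InitB and RunInitializers are hypothetical names, and a plain array stands in for the table that the linker builds between __xc_a and __xc_z.

```cpp
#include <cassert>
#include <string>
#include <vector>

using InitFn = void (*)();

static std::vector<std::string> g_log;

static void InitA() { g_log.push_back("A"); }
static void InitB() { g_log.push_back("B"); }

// Stand-in for the initializer table; null slots are allowed and skipped.
static InitFn g_table[] = { InitA, nullptr, InitB };

// Same loop shape as the _initterm listing: call every non-null entry in order.
static void RunInitializers(InitFn* first, InitFn* last) {
    for (InitFn* it = first; it != last; ++it) {
        if (*it == nullptr)
            continue;
        (**it)();
    }
}
```

Calling RunInitializers(g_table, g_table + 3) invokes InitA and then InitB, skipping the empty slot in between.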

On Linux

According to Linux x86 Program Start Up and Global Constructors and Destructors in C++:

There’s a function __do_global_ctors_aux that calls all “constructors” (it’s for C, but should be similar for C++ apps). This function calls constructors that are specified in the .ctors of ELF image.

As I mentioned, the details are different vs MSVC, but the idea of function pointers to constructors is the same. At some point before main() the runtime must call those constructors.

Implementation Dependent

Although non-local variables will be usually initialized before main() starts, it's not guaranteed by the standard. So if your code works on one platform, it doesn't mean it will work on some other compiler, or even version of the same compiler...

From: C++ draft: basic.start.dynamic#4:

It is implementation-defined whether the dynamic initialization of a non-local non-inline variable with static storage duration is sequenced before the first statement of main or is deferred. If it is deferred, it strongly happens before any non-initialization odr-use of any non-inline function or non-inline variable defined in the same translation unit as the variable to be initialized.

Storage and Linkage

So far I’ve used one global variable, but it wasn’t even marked as static. So what is a ‘static’ variable?

Colloquially, a static variable is a variable whose lifetime is the entire run of the program. Such a variable is initialized before main() and destroyed after.

In the C++ Standard 6.7.1 Static storage duration “basic.stc.static”, N4659, Draft:

All variables which do not have dynamic storage duration, do not have thread storage duration, and are not local have static storage duration. The storage for these entities shall last for the duration of the program

As you see, for non-local variables, you don’t have to apply the static keyword to end with a static variable.

We have a few options when declaring a static variable. We can distinguish them by storage and linkage:

  • Storage:
    • automatic - Default for variables in a scope.
    • static - The lifetime is bound with the program.
    • thread - The object is allocated when the thread begins and deallocated when the thread ends.
    • dynamic - Per request, using dynamic memory allocation functions.
  • Linkage
    • no linkage - The name can be referred to only from the scope it is in.
    • external - The name can be referred to from the scopes in the other translation units (or even from other languages).
    • internal - The name can be referred to from all scopes in the current translation unit

By default, if I write int i; outside of main() (or any other function) this will be a variable with a static storage duration and external linkage.

Here’s a short summary:

int i; // static storage, external linkage
static int t; // static storage, internal linkage
namespace {
    int j; // static storage, internal linkage
}
const int ci = 100; // static storage, internal linkage

int main()
{

}

Although usually, we think of static variables as globals it’s not always the case. By using namespaces or putting statics in a class, you can effectively hide it and make available according to requirements.

Static variables in a class

You can apply static to a data member of a class:

class MyClass
{
public:
    ...
private:
    static int s_Important;
};

// later in cpp file:
int MyClass::s_Important = 0;

s_Important has a static storage duration and it’s a single value shared by all class objects. It has external linkage - assuming the class also has external linkage.

Before C++17 each static class data member had to be defined in some cpp file (apart from static const integers…). Now you can use inline variables:

class MyClass
{
public:
    ...
private:
    // declare and define in one place!
    // since C++17
    inline static int s_Important = 0;
};

As I mentioned earlier, with classes (or namespaces) you can hide static variables, so they are not “globals”.

Static variables in functions

There’s also another special case that we should cover: statics in a function/scope:

void Foo()
{
    static bool bEnable = true;
    if (bEnable)
    {
        // ...
    }
}

From cppreference: storage duration

Static variables declared at block scope are initialized the first time control passes through their declaration (unless their initialization is zero- or constant-initialization, which can be performed before the block is first entered). On all further calls, the declaration is skipped.

For example, sometimes I like to use static bEnable variables in my debugging sessions (not in production!). Since the variable is shared across all function invocations, I can switch it back and forth between true and false. That way the variable can enable or disable some block of code: let’s say a new implementation vs the old one, and I can easily observe the effects - without recompiling the code.
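To see the “initialized the first time control passes through” rule in action, here’s a short sketch (ExpensiveInit, GetValue, NextId and g_initCalls are hypothetical names used only for this demo). The initializer of a block-scope static runs once, and the variable keeps its value between calls.

```cpp
#include <cassert>

static int g_initCalls = 0;

static int ExpensiveInit() {
    ++g_initCalls;  // counts how often the initializer actually runs
    return 42;
}

int GetValue() {
    static int value = ExpensiveInit();  // runs on the first call only
    return value;
}

int NextId() {
    static int counter = 0;  // constant-initialized; value survives between calls
    return ++counter;
}
```

No matter how many times GetValue() is called, ExpensiveInit() executes only once, and it doesn’t execute at all if GetValue() is never called.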

Wrap up

Although globals/statics sound easy, I found it very hard to prepare this post. Storage, linkage, various conditions and rules.
I was happy to see the code behind the initialization, so it’s clearer how it’s all done.

A few points to remember:

  • static variable’s lifetime is bound with the program lifetime. It’s usually created before main() and destroyed after it.
  • static variable might be visible internally (internal linkage) or externally (external linkage)
  • at the start static variables are zero-initialized, and then dynamic initialization happens
  • Still... be careful, as Static initializers will murder your family :)

Ah… wait… but what about initialization and destruction order of such variables?
Let’s leave this topic for another time :)
For now, you can read about static in static libraries: Static Variables Initialization in a Static Library, Example.


Simplify code with 'if constexpr' in C++17


Simplify your code with if constexpr in C++17

Before C++17 we had a few, quite ugly looking, ways to write static if (an if that works at compile time) in C++: you could use tag dispatching or SFINAE (for example via std::enable_if). Fortunately, that’s changed, and we can now take advantage of if constexpr!

Let’s see how we can use it and replace some std::enable_if code.

Intro

Static if in the form of if constexpr is an amazing feature that went into C++17. Recently @Meeting C++ there was a post where Jens showed how he simplified one code sample using if constexpr: How if constexpr simplifies your code in C++17.

I’ve found two additional examples that can illustrate how this new feature works:

  • Number comparisons
  • Factories with a variable number of arguments

I think those examples might help you to understand the static if from C++17.

But for a start, I’d like to recall the basic knowledge about enable_if to set some background.

Why compile time if?

At first, you may ask, why do we need static if and those complex templated expressions… wouldn’t normal if just work?

Here’s a test code:

template <typename T>
std::string str(T t)
{
    if (std::is_same_v<T, std::string>) // or is_convertible...
        return t;
    else
        return std::to_string(t);
}

The above routine might be some simple utility that is used to print stuff. As to_string doesn’t accept std::string we can test and just return t if it’s already a string. Sounds simple… but try to compile this code:

// code that calls our function
auto t = str("10"s);

You might get something like this:

In instantiation of 'std::__cxx11::string str(T) [with T =
std::__cxx11::basic_string<char>; std::__cxx11::string =
std::__cxx11::basic_string<char>]':
required from here
error: no matching function for call to
'to_string(std::__cxx11::basic_string<char>&)'
    return std::to_string(t);

is_same yields true for the type we used (string) and we can just return t, without any conversion… so what’s wrong?

Here’s the main point:

The compiler compiled both branches and found an error in the else case. It couldn’t reject “invalid” code for this particular template instantiation.

So that’s why we need static if, that would “discard” code and compile only the matching statement.

std::enable_if

One way to write static if in C++11/14 is to use enable_if.

enable_if (and enable_if_t since C++14) has quite a strange syntax:

template<bool B,class T =void>
struct enable_if;

enable_if will evaluate to T if the input condition B is true. Otherwise, it’s SFINAE, and a particular function overload is removed from the overload set.
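To see why this works, it helps to look at how such a trait can be written. The sketch below is an illustrative re-implementation (my_enable_if and describe are hypothetical names; real code should simply use std::enable_if): the primary template has no member type, and only the partial specialization for true provides it. When the condition is false, asking for ::type fails during substitution and the overload silently drops out.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical re-implementation for illustration only.
template <bool B, class T = void>
struct my_enable_if {};  // no member `type` when B is false

template <class T>
struct my_enable_if<true, T> {  // partial specialization for B == true
    using type = T;
};

// Participates in overload resolution only for integral types.
template <class T>
typename my_enable_if<std::is_integral<T>::value, std::string>::type
describe(T) { return "integral"; }

// Participates in overload resolution only for floating-point types.
template <class T>
typename my_enable_if<std::is_floating_point<T>::value, std::string>::type
describe(T) { return "floating"; }
```

For describe(1) only the first overload survives substitution, and for describe(1.5) only the second one, so there’s never an ambiguity between them.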

We can rewrite our basic example to:

template <typename T>
std::enable_if_t<std::is_same_v<T, std::string>, std::string> strOld(T t)
{
    return t;
}

template <typename T>
std::enable_if_t<!std::is_same_v<T, std::string>, std::string> strOld(T t)
{
    return std::to_string(t);
}

Not easy… right?

See below how we can simplify such code with if constexpr from C++17. After you read the post, you’ll be able to rewrite our str utility quickly.

Use Case 1 - Comparing Numbers

At first, let’s start with a simple example: a close_enough function that works on two numbers. If the numbers are not floating points (like when we have two ints), then we can just compare them. Otherwise, for floating points, it’s better to use some epsilon.

I found this sample in Practical Modern C++ Teaser - a fantastic walkthrough of modern C++ features by Patrice Roy. He was also very kind and allowed me to include this example.

C++11/14 version:


template <class T>
constexpr T absolute(T arg) {
    return arg < 0 ? -arg : arg;
}

template <class T>
constexpr enable_if_t<is_floating_point<T>::value, bool>
close_enough(T a, T b) {
    return absolute(a - b) < static_cast<T>(0.000001);
}
template <class T>
constexpr enable_if_t<!is_floating_point<T>::value, bool>
close_enough(T a, T b) {
    return a == b;
}

As you see, there’s a use of enable_if. It’s very similar to our str function. The code tests whether the type of the input numbers is a floating-point type (is_floating_point). Then, the compiler can remove one function from the overload resolution set.

And now, let’s look at the C++17 version:

template <class T>
constexpr T absolute(T arg) {
    return arg < 0 ? -arg : arg;
}

template <class T>
constexpr auto precision_threshold = T(0.000001);

template <class T>
constexpr bool close_enough(T a, T b) {
    if constexpr (is_floating_point_v<T>) // << !!
        return absolute(a - b) < precision_threshold<T>;
    else
        return a == b;
}

Wow… so just one function, that looks almost like a normal function. With nearly “normal” if :)

if constexpr evaluates constexpr expression at compile time and then discards the code in one of the branches.

BTW: Can you see some other C++17 features that were used here?

Play with the code
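Returning to the str utility from the beginning of the post, the same technique could be applied like this. It’s a minimal sketch that assumes a C++17 compiler; the only change against the failing version is the constexpr keyword in the if statement.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

template <typename T>
std::string str(T t) {
    if constexpr (std::is_same_v<T, std::string>)
        return t;                  // instantiated only when T is std::string
    else
        return std::to_string(t);  // discarded when T is std::string
}
```

Now str(std::string("10")) compiles, because the to_string branch is never instantiated for std::string, while str(10) still goes through std::to_string.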

Use case 2 - factory with variable arguments

In the item 18 of Effective Modern C++ Scott Meyers described a method called makeInvestment:

template <typename... Ts>
std::unique_ptr<Investment>
makeInvestment(Ts&&... params);

There’s a factory method that creates derived classes of Investment and the main advantage is that it supports a variable number of arguments!

For example, here are the proposed types:

class Investment
{
public:
    virtual ~Investment() { }

    virtual void calcRisk() = 0;
};

class Stock : public Investment
{
public:
    explicit Stock(const std::string&) { }

    void calcRisk() override { }
};

class Bond : public Investment
{
public:
    explicit Bond(const std::string&, const std::string&, int) { }

    void calcRisk() override { }
};

class RealEstate : public Investment
{
public:
    explicit RealEstate(const std::string&, double, int) { }

    void calcRisk() override { }
};

The code from the book was too idealistic and didn’t work in general - it worked only as long as all your classes had the same number and types of constructor parameters:

Scott Meyers: Modification History and Errata List for Effective Modern C++:

The makeInvestment interface is unrealistic, because it implies that all derived object types can be created from the same types of arguments. This is especially apparent in the sample implementation code, where are arguments are perfect-forwarded to all derived class constructors.

For example if you had a constructor that needed two arguments and one constructor with three arguments, then the code might not compile:

// pseudo code:
Bond(int, int, int) {}
Stock(double, double) {}
make(args...)
{
    if (bond)
        new Bond(args...);
    else if (stock)
        new Stock(args...)
}

Now, if you write make(bond, 1, 2, 3), the else branch won’t compile, as there is no Stock(1, 2, 3) available! To make it work, we need something like a static if: an if that works at compile time and rejects the parts of the code that don’t match a condition.

Some posts ago, with the help of one reader, we came up with a working solution (you can read more in Bartek’s coding blog: Nice C++ Factory Implementation 2).

Here’s the code that could work:

template<typename... Ts>
unique_ptr<Investment>
makeInvestment(const string& name, Ts&&... params)
{
    unique_ptr<Investment> pInv;

    if (name == "Stock")
        pInv = constructArgs<Stock, Ts...>(forward<Ts>(params)...);
    else if (name == "Bond")
        pInv = constructArgs<Bond, Ts...>(forward<Ts>(params)...);
    else if (name == "RealEstate")
        pInv = constructArgs<RealEstate, Ts...>(forward<Ts>(params)...);

    // call additional methods to init pInv...

    return pInv;
}

As you can see the “magic” happens inside constructArgs function.

The main idea is to return unique_ptr<Type> when Type is constructible from a given set of arguments and nullptr when it’s not.

Before C++17

In my previous solution (pre C++17) we used std::enable_if and it looked like that:

// before C++17
template<typename Concrete, typename... Ts>
enable_if_t<is_constructible<Concrete, Ts...>::value, unique_ptr<Concrete>>
constructArgsOld(Ts&&... params)
{
    return std::make_unique<Concrete>(forward<Ts>(params)...);
}

template<typename Concrete, typename... Ts>
enable_if_t<!is_constructible<Concrete, Ts...>::value, unique_ptr<Concrete>>
constructArgsOld(...)
{
    return nullptr;
}

std::is_constructible @cppreference.com - allows us to quickly test if a list of arguments could be used to create a given type.

In C++17 there’s a helper variable template:

template<class T, class... Args>
inline constexpr bool is_constructible_v = is_constructible<T, Args...>::value;

So we could make the code shorter a bit…
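As a sketch of that shorter form, here are the same two overloads written with is_constructible_v. The Stock type below is a simplified stand-in added only so the snippet is self-contained; it is an assumption, not the full class from above.

```cpp
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

// Simplified stand-in for the Stock class (assumption for this sketch).
struct Stock {
    explicit Stock(const std::string&) {}
};

// Chosen when Concrete can be constructed from Ts...
template <typename Concrete, typename... Ts>
std::enable_if_t<std::is_constructible_v<Concrete, Ts...>, std::unique_ptr<Concrete>>
constructArgsOld(Ts&&... params) {
    return std::make_unique<Concrete>(std::forward<Ts>(params)...);
}

// Fallback: chosen when Concrete cannot be constructed from Ts...
template <typename Concrete, typename... Ts>
std::enable_if_t<!std::is_constructible_v<Concrete, Ts...>, std::unique_ptr<Concrete>>
constructArgsOld(...) {
    return nullptr;
}
```

The conditions are a bit easier to read, but we still need two overloads with mutually exclusive enable_if conditions.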

Still, using enable_if looks ugly and complicated. How about a C++17 version?

With if constexpr

Here’s the updated version:

template<typename Concrete, typename... Ts>
unique_ptr<Concrete> constructArgs(Ts&&... params)
{
    if constexpr (is_constructible_v<Concrete, Ts...>)
        return make_unique<Concrete>(forward<Ts>(params)...);
    else
        return nullptr;
}

We can even extend it with a little logging features, using fold expression:

template<typename Concrete, typename... Ts>
std::unique_ptr<Concrete> constructArgs(Ts&&... params)
{
    cout << __func__ << ": ";
    // fold expression:
    ((cout << params << ", "), ...);
    cout << "\n";

    if constexpr (std::is_constructible_v<Concrete, Ts...>)
        return make_unique<Concrete>(forward<Ts>(params)...);
    else
        return nullptr;
}

Cool… right? :)

All the complicated syntax of enable_if went away; we don’t even need a function overload for the else case. We can now wrap expressive code in just one function.

if constexpr evaluates the condition and only one block will be compiled. In our case, if a type is constructible from a given set of arguments, then we’ll compile the make_unique call. If not, then nullptr is returned (and make_unique is not even compiled).
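Putting the pieces together, a trimmed-down version of the factory might look like the sketch below. It keeps only Stock and Bond to stay short and leaves out the logging; everything else follows the code shown above.

```cpp
#include <memory>
#include <string>
#include <type_traits>
#include <utility>

class Investment {
public:
    virtual ~Investment() = default;
    virtual void calcRisk() = 0;
};

class Stock : public Investment {
public:
    explicit Stock(const std::string&) {}
    void calcRisk() override {}
};

class Bond : public Investment {
public:
    explicit Bond(const std::string&, const std::string&, int) {}
    void calcRisk() override {}
};

// Returns a new Concrete if it is constructible from params, nullptr otherwise.
template <typename Concrete, typename... Ts>
std::unique_ptr<Concrete> constructArgs(Ts&&... params) {
    if constexpr (std::is_constructible_v<Concrete, Ts...>)
        return std::make_unique<Concrete>(std::forward<Ts>(params)...);
    else
        return nullptr;
}

template <typename... Ts>
std::unique_ptr<Investment> makeInvestment(const std::string& name, Ts&&... params) {
    std::unique_ptr<Investment> pInv;
    if (name == "Stock")
        pInv = constructArgs<Stock, Ts...>(std::forward<Ts>(params)...);
    else if (name == "Bond")
        pInv = constructArgs<Bond, Ts...>(std::forward<Ts>(params)...);
    return pInv;
}
```

A call such as makeInvestment("Bond", std::string("abc")) compiles fine and simply yields an empty pointer, because Bond cannot be built from a single string.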

Play with the code

Wrap up

Compile-time if is an amazing feature that greatly simplifies templated code. What’s more, it’s much more expressive and nicer than previous solutions: tag dispatching or enable_if (SFINAE). Now, you can easily express your intent in a way that resembles “run-time” code.

In this article, we’ve touched only basic expressions, and as always I encourage you to play more with this new feature and explore.

And going back to our str example:
Can you now rewrite our str function using if constexpr? :)

The C++ Standard Library book - overview & giveaway


Let’s have a quick overview of another book related to Modern C++ and The Standard Library. This time I picked a book by Rainer Grimm, the author of the modernescpp blog.

Read more if you’d like to win C++ book bundle! :)

The book

The C++ Standard Library by Rainer Grimm

The C++ Standard Library

What every professional C++ programmer should know about the C++ standard library

The book is available at LeanPub: here’s the link.

And you can find Rainer’s blog at: modernescpp.com

This book comes from the German version (amazon.de link), it was translated into English and then updated with the information about C++14. Later, in the second version of the book, we have descriptions of C++17 features.

In the newest edition you can expect all info about significant STL C++17 changes like string_view, parallel algorithms, std::filesystem, std::any, std::optional and more.

The book is a concise overview of the features, with lots of examples. And as I know from the author it was not an easy task to fit all vital information in around 200 pages.

One note, this book comes as an ebook, but there’s a Korean translation that appeared as a printed version.

Let’s see what’s inside.

The Structure

1. The Standard Library

History and an overview of the Library. Where we are in the standardization process.

2. Utilities

Everything you need to start with STL: pairs and tuples, chrono, smart pointers, type traits and C++17 utils: any, optional and variant.

3. Interface of All Containers

Common functionalities of sequential and associative containers: creation, deletion, size and access.

4. Sequential Container

Basics about arrays, vectors, deques, lists and forward lists.

5. Associative Containers

Information about ordered associative containers (like std::map or std::set) and then unordered (hash maps in the form of std::unordered_map or std::unordered_set).

6. Adaptors for Containers

Stacks, queues and priority queues.

7. Iterators

Iterator intro, categories, how to use them.

8. Callable Units

Function objects, functions and lambdas.

9. Algorithms

A quick overview of all useful algorithms: from for_each to sorting, min max, permutations and hashing.

10. Numeric

Random numbers mostly.

11. Strings

How to create and use strings in C++: concatenation, element access, comparisons, searching, numeric conversions.

12. String Views

A short chapter about new, non-owning string object - that was introduced in C++17. When they can help and how to use them with the relation to regular strings.

13. Regular Expressions

Regular expressions in the STL were introduced with C++11. This chapter contains a short overview.

14. Input and Output Streams

How to use streams

15. Filesystem library

Basic introduction to the filesystem from C++17

16. Multithreading

Jump start into multithreading (core parts introduced in C++11): memory model, atomics, threads, shared variables, condition variables and tasks.

Summary

Final mark: 4+/5

Pros:

  • A concise overview of the Standard Library
  • A lot of examples
  • Great way to learn STL including C++17
  • Various tips and suggestions spread out through the book

Cons:

  • sometimes code samples could be explained in more detail
  • doesn’t look as polished as larger books from standard publishers.
  • only ebook English version

Rainer Grimm’s book is a great way to learn basics of STL, including major changes of C++17. The book is easy to read. It can serve as a quick reference or as an overview of the Standard Library. It might be handy if you just finished some intro book about the language and you look for another step.

I’m also a big fan of self-publishing and Rainer is a great example that you can succeed in such approach.

Also if you look for more about multithreading Rainer has another book solely on that topic. Check it out here: Concurrency with Modern… by Rainer Grimm.

So… if you’re interested in the book… I have good news:

Giveaway

Together with the author - Rainer Grimm - we’d like to offer you 5 (five!) bundles of the books.

All you need to do is to subscribe and leave a comment !!using the giveaway form!!

Answer one or two of those questions:

  • What are your main blockers when learning C++?
  • What are the areas of C++ you’d like (or need) to learn next?

The giveaway ends in two weeks, 9th April in the morning, Poland's Time

The C++ Standard Library by Rainer Grimm, Giveaway

Deprecating Raw Pointers in C++20


Raw pointers in C++

The C++ Standard moves at a fast pace. Probably, not all developers have caught up with C++11/14 yet, and recently we got C++17. Now it’s time to prepare for C++20!
A few weeks ago The C++ Committee had an official ISO meeting in Jacksonville, FL (12-17 March 2018) where they worked hard on the new specification.

Besides many significant things that were discussed at the meeting like modules, concepts, ranges, The C++ Committee accepted one hugely anticipated feature: deprecation of raw pointers!

Intro

If you’d like to read about all of the changes that the Committee did for C++20 you can check various trip reports that appeared recently. For example:

Honestly, I rolled my eyes when I saw the proposal of removing raw pointers! Such task looks so complicated! How do they plan to implement that? And what about backward compatibility that is one of the primary goals of new language releases?

But then I understood how excellent that move really is.

Just to be clear about the spec:

The plan is to deprecate raw pointers in C++20. So you’ll get a warning from a conformant compiler. Later in C++23 or C++26, raw pointers will be removed from the language. See more details under this link.

Reasoning

How many times were you tracing some bug, probably for long hours, before noticing the main reason was just having an invalid pointer?

Of course, knowing that your pointer is invalid is not as easy as it may sound. Even if you delete ptr; and set it to nullptr you’re not safe. A pointer only represents a memory address, so if you assign it to nullptr, there’s no automatic propagation of that change to all the owners or observers of this pointer.

The pointer-specific issues (memory issues, pointer indirection, unsafe calls or memory access, to name a few) are probably among the most frequent reasons why C++ is perceived as hard to use.

Have a look at Rust. They put a lot of effort into making the language reliable. It’s still a systems programming language, compiled to machine code. But Rust offers many safety checks. You can use raw pointers, but only in a few places. And most of the time the language gives you better alternatives.

Ok, ok… but raw pointers are useful in a lot of cases! So let’s have a look at what the Committee proposes as alternatives:

Alternatives to raw pointers

Here are the main examples where raw pointers are handy, and what we can use from modern C++ to replace them.

Avoiding copying / aliasing

One obvious reason to use pointers is to hold the address of some object so that you can manipulate it without the need to copy. This is especially handy for passing objects to functions:

void Process(GameObject* pObj) {
    pObj->Generate();
}

Unfortunately, such code is a common "unsafe" place. For example, you often need to check if such input pointer is not null. Otherwise dereferencing an invalid pointer might generate an unexpected crash.

We have a few alternatives here:

  • Pass a value - if your object supports move semantics then the copy might not cost much
  • Pass a smart pointer
  • Pass a reference
  • For copyable and assignable references you can use std::reference_wrapper.

For now, you can also consider using gsl::not_null which I described in this post: How not_null can improve your code?.
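Two of the alternatives from the list above can be sketched as follows. GameObject here is a hypothetical stand-in with a counter, and ProcessAll is a made-up helper; both exist only to keep the snippet self-contained.

```cpp
#include <functional>
#include <vector>

// Hypothetical stand-in for the GameObject type used above.
struct GameObject {
    int generated = 0;
    void Generate() { ++generated; }
};

// A reference cannot be null, so no check is needed inside the function.
void Process(GameObject& obj) {
    obj.Generate();
}

// std::reference_wrapper is copyable and assignable, so it can be stored in
// containers, where a plain reference cannot.
void ProcessAll(const std::vector<std::reference_wrapper<GameObject>>& objs) {
    for (GameObject& obj : objs)  // implicit conversion back to GameObject&
        Process(obj);
}
```

A caller can build the container with std::ref, for example std::vector<std::reference_wrapper<GameObject>> objs{std::ref(a), std::ref(b)};.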

Polymorphism

References and smart pointers will handle polymorphism. So no worries here.

Dynamic memory allocation

In modern C++ you should avoid using explicit new. You have many tools to simplify that, like std::make_shared, std::make_unique . It’s another case where using a raw pointer is not needed.

std::shared_ptr<int[]> ptrArray(new int[N]); // since C++17

Observing other objects

Using raw pointers for observing other objects is probably the main issue that caused the change in the standard. With raw pointers, you are not sure if the pointer is still valid. Therefore there are many cases where you might encounter an access violation error.

By using smart pointers, you can safely avoid many of such issues. For example, with weak_ptr you can check if the pointer is still alive or not.

void observe(std::weak_ptr<GameObject> pObj)
{
    if (auto observePtr = pObj.lock()) {
        // object is valid
    } else {
        // invalid
    }
}
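Here is a self-contained variant of the same check that reports the result to the caller. GameObject is again a hypothetical stand-in, and returning bool is a choice made only for this sketch.

```cpp
#include <memory>

struct GameObject { int id = 0; };  // hypothetical stand-in

// Returns true if the observed object is still alive.
bool observe(const std::weak_ptr<GameObject>& pObj) {
    if (auto observePtr = pObj.lock()) {
        // observePtr is a shared_ptr that keeps the object alive in this scope
        return true;
    }
    return false;  // the last shared_ptr is gone, the object was destroyed
}
```

After the last owning shared_ptr is reset, lock() returns an empty shared_ptr and the function reports false instead of touching freed memory.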

Nullable objects

Pointers are also used to transfer the information about the results of some operations:

File* Open() {...}

auto f = Open();
if (f)
{
}

Here the same variable plays two roles: it stores the object (the file) and it also conveys whether that object is valid. With C++17 we have std::optional, which is perfectly suited for that role. It’s far more expressive and safer.
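A minimal sketch of the std::optional variant could look like this. The File struct and the "empty path means failure" rule are assumptions made up for the illustration; a real implementation would try to open the file.

```cpp
#include <optional>
#include <string>

struct File { std::string path; };  // hypothetical stand-in for a file handle

// Returns an empty optional when the file cannot be opened.
// Assumption for this sketch: an empty path stands in for any real failure.
std::optional<File> Open(const std::string& path) {
    if (path.empty())
        return std::nullopt;
    return File{path};
}
```

At the call site, if (auto f = Open("data.txt")) { /* use f->path */ } reads almost the same as the pointer version, but the return type now says explicitly that the result may be absent.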

Performance

Safety is not cheap, and sometimes we must give up a bit of performance to have more checks and validations. However, in C++, many pointer alternatives add no runtime cost. For example, unique_ptr is safe and compiles down to almost nothing: just a raw pointer under the hood. Hence, any memory access made through it is as cheap as with a raw pointer.

Accessing a shared_ptr is also as fast as a raw pointer, but when copying, shared_ptr needs to manage the control block which involves atomic operations.

Wrap up

From my perspective, the step of removing pointers will give us an entirely new language! C++ will be safer and more straightforward to learn. What’s more, we don’t lose any performance as we have alternatives that are also as close to the metal as raw pointers.

The devil lies in details, and the Committee needs to do a lot of work to make the final specification. Maybe we’ll get some new mechanism for dealing with pointers: like deferred_ptr or even some garbage collection mechanisms?

There’s an excellent presentation from Herb Sutter about “Leak Freedom”, and you can watch it here:

What’s your view on that?
Can you live without raw pointers?

Productive C++ Developer, my recent talk


C++ Goodies and tools

A few weeks ago I gave another talk at my local C++ user group. We discussed recent “goodies” from C++ and tools that can increase productivity.

Intro

In my post for the “C++ summary at the end of 2017” I mentioned that we could see a considerable improvement in the area of tooling for the language.

Most of the time we can hear that “C++ is hard”, parsing and analysing it is even harder… yet, maybe we reached the point where we can finally say “we have great tools”? Or at least we have decent tools!

Here are the main topics that I discussed during the talk:

Recent C++ Updates

The talk was just a few days after Jacksonville’s C++ Committee Meeting. Therefore it was a good occasion for me to present some news about the current language status.

In the previous ISO meeting the Committee voted the following major things into the C++20 draft:

Albuquerque, November 2017

  • operator<=> (aka the spaceship operator) and library support for operator<=>
  • Range-based for with initializer
  • Apply [[nodiscard]] to the standard library - P0600R1
  • std::osyncstream
  • constexpr std::complex
  • constexpr algorithms
  • Floating point std::atomics
  • std::string/std::string_view.starts_with() and .ends_with()

And in the recent meeting we got:

  • Make typename optional in more places
  • [[likely]] , [[unlikely]] and [[no_unique_address]] - attributes
  • <version> header
  • Calendar and timezone library - big and nice addition to STL - you can find some news here:
  • syncstream manipulators for C++ Synchronized Buffered Ostream
  • span

More info: 2018 Jacksonville ISO C++ Committee Reddit Trip Report : cpp

Of course, we’re waiting for some more significant features like modules, concepts, ranges, networking, co-routines. The good news is that we can expect most of them… or their core parts to be in C++20. So let’s wait, and I keep my fingers crossed for the committee, as they have to do a lot of work to “assemble” those delicate pieces together.

Tools

In the second part, I did a demo of tools that I use or recently experimented with.

On a daily basis, I work in Visual Studio, and I am pleased to see how the platform evolves. One point is, of course, keeping up with the standardization of the language. While moving to C++11 was a big problem for VS in the past, now the pace is amazing. Most of the blockers in the compiler were, as far as I know, rewritten, and the implementation of C++17 is very close to being finished. VS 2017 was released in March 2017, and so far we have had about six releases with useful updates.
And we can expect more good stuff in 2018: see this roadmap for VS.

Some great additions in VS:

  • Open Folder
  • CMake support – 15.4 - so I don’t have to run CMake manually to get a VS solution!
  • Clang compiler in VS!
  • Google and Boost Tests adapters since 15.5!

The next big thing is Clang and the tools that are built on top of Clang tooling. You can use Clang main tools like:

  • Format
  • Tidy
  • Analyzer

I especially like to use Clang Power Tools that are provided for Visual Studio.

But we have more products that are based on Clang:

Also, recently I got a chance to play with some unique products:

Conan package manager

Conan

Conan looks like a fantastic package manager for C++. I posted some more thoughts on that in my recent post: pimpl vs Abstract Interface - a practical tutorial.

And:

Live++ Molecular Matters

Live++ - Molecular Matters

Live++ is a mind-blowing tool! In a matter of seconds you can recompile your code changes and immediately hot-patch the running binaries! It's just one DLL that you need to load at the start of your app, and then you have access to this amazing feature. Very useful for testing and prototyping.

Live++ was released publicly on 27th March, and I got a chance to be a beta tester a few months earlier :)

The Slides

Summary

Of course, there are many more amazing tools that we can use for C++ today. During the presentation, I scratched only the surface of the topic.

What are your favourite tools for C++?
Do you agree with my opinion that currently for C++ we have quite decent tools?

Refactoring with C++17 std::optional


Refactoring with C++17 std::optional

There are many situations where you need to express that something is “optional” - an object that might contain a value or not. You have several options to implement such case, but with C++17 there’s probably the most helpful way: std::optional.

For today I’ve prepared one refactoring case where you can learn how to apply this new C++17 feature.

Intro

Let’s dive into the code quickly.

There’s a function that takes an ObjSelection representing, for example, the current mouse selection. The function scans the selection and finds out the number of animating objects, whether there are any civil units and whether there are any combat units.

The existing code looks like that:

class ObjSelection
{
public:
    bool IsValid() const { return true; }
    // more code...
};

bool CheckSelectionVer1(const ObjSelection& objList,
                        bool* pOutAnyCivilUnits,
                        bool* pOutAnyCombatUnits,
                        int* pOutNumAnimating);

As you can see above, there are mostly output parameters (in the form of raw pointers), and the function returns true/false to indicate success (for example the input selection might be invalid).

I’ll skip the implementation for now, but here’s an example code that calls this function:

ObjSelection sel;

bool anyCivilUnits {false};
bool anyCombatUnits {false};
int numAnimating {0};
if (CheckSelectionVer1(sel, &anyCivilUnits, &anyCombatUnits, &numAnimating))
{
// ...
}

Why is this function not perfect?

There might be several things:

  • Look at the caller’s code: we have to create all the variables that will hold the outputs. It starts to look like code duplication if you call the function in many places.
  • Output parameters: the Core Guidelines suggest not using them.
  • If you have raw pointers you have to check if they are valid.
  • What about extending the function? What if you need to add another output param?

Anything else?

How would you refactor this?

Motivated by Core Guidelines and new C++17 features, I plan to use the following refactoring steps:

  1. Refactor output parameters into a tuple that will be returned.
  2. Refactor tuple into a separate struct and reduce the tuple to a pair.
  3. Use std::optional to express possible errors.

Let’s start!

Tuple

The first step is to convert the output parameters into a tuple and return it from the function.

According to F.21: To return multiple “out” values, prefer returning a tuple or struct:

A return value is self-documenting as an “output-only” value. Note that C++ does have multiple return values, by convention of using a tuple (including pair), possibly with the extra convenience of tie at the call site.

After the change the code might look like this:

std::tuple<bool, bool, bool, int>
CheckSelectionVer2(const ObjSelection& objList)
{
    if (!objList.IsValid())
        return {false, false, false, 0};

    // local variables:
    int numCivilUnits = 0;
    int numCombat = 0;
    int numAnimating = 0;

    // scan...

    return {true, numCivilUnits > 0, numCombat > 0, numAnimating};
}

A bit better… isn’t it?

  • No need to check raw pointers
  • Code is quite expressive

What’s more, on the caller side you can use Structured Bindings to unpack the returned tuple:

auto [ok, anyCivil, anyCombat, numAnim] = CheckSelectionVer2(sel);
if(ok)
{
// ...
}
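For experimenting with the tuple version, here is a self-contained sketch. The Unit struct, the units member and the rule that an empty selection is invalid are assumptions added only to give the “scan” step something to work on; the post does not show the real data. The second function shows the pre-C++17 call site with std::tie, as mentioned in the guideline quoted above.

```cpp
#include <tuple>
#include <vector>

// Hypothetical unit record (assumption for this sketch).
struct Unit {
    bool civil;
    bool combat;
    bool animating;
};

class ObjSelection {
public:
    std::vector<Unit> units;
    // Assumption: an empty selection counts as invalid.
    bool IsValid() const { return !units.empty(); }
};

std::tuple<bool, bool, bool, int> CheckSelectionVer2(const ObjSelection& objList) {
    if (!objList.IsValid())
        return {false, false, false, 0};

    int numCivilUnits = 0;
    int numCombat = 0;
    int numAnimating = 0;
    for (const Unit& u : objList.units) {  // the "scan" step
        numCivilUnits += u.civil ? 1 : 0;
        numCombat += u.combat ? 1 : 0;
        numAnimating += u.animating ? 1 : 0;
    }
    return {true, numCivilUnits > 0, numCombat > 0, numAnimating};
}

// Pre-C++17 call site: std::tie, ignoring the outputs we do not need.
int NumAnimatingOrZero(const ObjSelection& sel) {
    bool ok = false;
    int numAnim = 0;
    std::tie(ok, std::ignore, std::ignore, numAnim) = CheckSelectionVer2(sel);
    return ok ? numAnim : 0;
}
```

Note how the call site has to know the position of each output, with std::tie as well as with structured bindings.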

Unfortunately, I don’t see this version as the best one. I think that it’s easy to forget the order of outputs from the tuple. There was even an article on that at SimplifyC++: Smelly std::pair and std::tuple.

What’s more, the problem of function extensions is still present. So when you’d like to add another output value, you have to extend this tuple and the caller site.

That’s why I propose another step: a structure (as it’s also suggested by Core Guidelines).

A separate structure

The outputs seem to represent related data. That’s why it’s probably a good idea to wrap them into a struct called SelectionData.

struct SelectionData
{
    bool anyCivilUnits {false};
    bool anyCombatUnits {false};
    int numAnimating {0};
};

And then you can rewrite the function into:

std::pair<bool, SelectionData> CheckSelectionVer3(const ObjSelection& objList)
{
    SelectionData out;

    if (!objList.IsValid())
        return {false, out};

    // scan...

    return {true, out};
}

And the caller site:

if (auto [ok, selData] = CheckSelectionVer3(sel); ok)
{
// ...
}

I’ve used std::pair so we still preserve the success flag; it’s not part of the new struct.

The main advantages we gained here are the logical structure and extensibility. If you want to add a new output, just extend the structure.

But isn’t std::pair<bool, MyType> similar to std::optional?

std::optional

From cppreference - std::optional:

The class template std::optional manages an optional contained value, i.e. a value that may or may not be present.
A common use case for optional is the return value of a function that may fail. As opposed to other approaches, such as std::pair<T,bool>, optional handles expensive-to-construct objects well and is more readable, as the intent is expressed explicitly.

That seems to be the perfect choice for our code. We can remove ok and rely on the semantics of the optional.

Just for the reference std::optional was added in C++17 (see my description), but before you could also leverage boost::optional as they are mostly the same type.

The new version of the code:

std::optional<SelectionData> CheckSelection(const ObjSelection& objList)
{
    if (!objList.IsValid())
        return {};

    SelectionData out;

    // scan...

    return {out};
}

And the caller site:

if (auto ret = CheckSelection(sel); ret.has_value())
{
// access via *ret or even ret->
// ret->numAnimating
}
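The final version can be tried out with the sketch below. As before, the Unit struct, the units member and the "empty selection is invalid" rule are assumptions that fill in the parts the post skips ("scan..." and "more code...").

```cpp
#include <optional>
#include <vector>

// Hypothetical unit record (assumption for this sketch).
struct Unit {
    bool civil;
    bool combat;
    bool animating;
};

class ObjSelection {
public:
    std::vector<Unit> units;
    // Assumption: an empty selection counts as invalid.
    bool IsValid() const { return !units.empty(); }
};

struct SelectionData {
    bool anyCivilUnits {false};
    bool anyCombatUnits {false};
    int numAnimating {0};
};

std::optional<SelectionData> CheckSelection(const ObjSelection& objList) {
    if (!objList.IsValid())
        return std::nullopt;  // no result at all, not a "default" result

    SelectionData out;
    for (const Unit& u : objList.units) {  // the "scan" step
        out.anyCivilUnits = out.anyCivilUnits || u.civil;
        out.anyCombatUnits = out.anyCombatUnits || u.combat;
        out.numAnimating += u.animating ? 1 : 0;
    }
    return out;
}
```

Adding another output now means adding one member to SelectionData; the function signature and the existing call sites stay the same.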

What are the advantages of the optional version?

  • Clean and expressive form
  • Efficient: Implementations of optional are not permitted to use additional storage, such as dynamic memory, to allocate its contained value. The contained value shall be allocated in a region of the optional storage suitably aligned for the type T.
    • Don’t worry about extra memory allocations.

The `optional` version looks best to me.


The code

You can play with the code below, compile and experiment:

Wrap up

In this post, you’ve seen how to refactor lots of ugly-looking output parameters to a nicer std::optional version. The optional wrapper clearly expresses that the computed value might be not present. Also, I’ve shown how to wrap several function parameters into a separate struct. Having one separate type lets you easily extend the code while keeping the logical structure at the same time.

How would you refactor the first version of the code?
Do you return tuples or try to create structs from them?

Here’s some more articles that helped me with this post:

Using C++17 std::optional


Using std::optional, C++

Let’s take a pair of two types <YourType, bool> - what can you do with such composition?

In this article, I’ll describe std::optional - a new helper type added in C++17. It’s a wrapper for your type and a flag that indicates if the value is initialized or not. Let’s see where it can be useful and how you can use it.

Intro

By adding the boolean flag to other types, you can achieve a thing called “nullable types”. As mentioned, the flag is used to indicate whether the value is available or not. Such wrapper represents an object that might be empty in an expressive way (so not via comments :))

While you can achieve “null-ability” by using unique values (-1, infinity, nullptr), it’s not as clear as the separate wrapper type. Alternatively, you could even use std::unique_ptr<Type> and treat the empty pointer as not initialized - this works, but comes with the cost of allocating memory for the object.

Optional types - that come from the functional programming world - bring type safety and expressiveness. Most other languages have something similar: for example std::option::Option in Rust, Optional<T> in Java, Data.Maybe in Haskell.

std::optional was added in C++17 and brings a lot of experience from boost::optional that was available for many years. Since C++17 you can just #include <optional> and use the type.

Such wrapper is still a value type (so you can copy it, via deep copy). What’s more, std::optional doesn’t need to allocate any memory on the free store.

std::optional is a part of C++ vocabulary types along with std::any, std::variant and std::string_view.

When to use

Usually, you can use an optional wrapper in the following scenarios:

  • If you want to represent a nullable type nicely.
    • Rather than using unique values (like -1, nullptr, NO_VALUE or something)
    • For example, user’s middle name is optional. You could assume that an empty string would work here, but knowing if a user entered something or not might be important. With std::optional<std::string> you get more information.
  • Return a result of some computation (processing) that fails to produce a value and is not an error.
    • For example finding an element in a dictionary: if there’s no element under a key it’s not an error, but we need to handle the situation.
  • To perform lazy-loading of resources.
    • For example, a resource type has no default constructor, and its construction is expensive. So you can define it as std::optional<Resource> (and you can pass it around the system), and then load it only if needed later.
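The lazy-loading scenario from the list above can be sketched like this. Resource and ResourceHolder are hypothetical names invented for the illustration; the "loading" is reduced to storing a path.

```cpp
#include <optional>
#include <string>
#include <utility>

// Hypothetical resource type without a default constructor.
struct Resource {
    explicit Resource(std::string p) : path(std::move(p)) {}
    std::string path;
};

class ResourceHolder {
public:
    explicit ResourceHolder(std::string path) : mPath(std::move(path)) {}

    bool IsLoaded() const { return mResource.has_value(); }

    // Constructs the resource on first access only.
    const Resource& Get() {
        if (!mResource)
            mResource.emplace(mPath);
        return *mResource;
    }

private:
    std::string mPath;
    std::optional<Resource> mResource;  // starts empty, no Resource is constructed
};
```

The holder can be created and passed around cheaply; the Resource constructor runs only when Get() is called for the first time.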

I like the description from boost optional which summarizes when we should use the type:

From the boost::optional documentation: When to use Optional

It is recommended to use optional<T> in situations where there is exactly one, clear (to all parties) reason for having no value of type T, and where the lack of value is as natural as having any regular value of T

While the decision to use optional might sometimes be blurry, you shouldn’t use it for error handling, as it best suits the cases where an empty value is a normal state of the program.

Basic Example

Here’s a simple example of what you can do with optional:

std::optional<std::string> UI::FindUserNick()
{
    if (nick_available)
        return { mStrNickName };

    return std::nullopt; // same as return { };
}

// use:
std::optional<std::string> UserNick = UI->FindUserNick();
if (UserNick)
    Show(*UserNick);

In the above code we define a function that returns optional containing a string. If the user’s nickname is available, then it will return a string. If not, then it returns nullopt. Later we can assign it to an optional and check (it converts to bool) if it contains any value or not. Optional defines operator* so we can easily access the contained value.

In the following sections you’ll see how to create std::optional, operate on it, pass it around, and what performance cost you might want to consider.

The Series

This article is part of my series about C++17 Library Utilities. Here’s the list of the other topics that I’ll cover:

  • Refactoring with std::optional
  • Using std::optional(this post)
  • Error handling and std::optional
  • Using std::variant
  • Using std::any
  • In place construction for std::optional, std::variant and std::any
  • Using std::string_view
  • C++17 string searchers & conversion utilities
  • Working with std::filesystem
  • Something more? :)

Resources about C++17 STL:

OK, so let’s move to std::optional.

std::optional Creation

There are several ways to create std::optional:

// empty:
std::optional<int> oEmpty;
std::optional<float> oFloat = std::nullopt;

// direct:
std::optional<int> oInt(10);
std::optional oIntDeduced(10); // deduction guides

// make_optional
auto oDouble = std::make_optional(3.0);
auto oComplex = std::make_optional<std::complex<double>>(3.0, 4.0);

// in_place
std::optional<std::complex<double>> o7{std::in_place, 3.0, 4.0};

// will call vector with direct init of {1, 2, 3}
std::optional<std::vector<int>> oVec(std::in_place, {1, 2, 3});

// copy/assign:
auto oIntCopy = oInt;

As you can see in the above code sample, you have a lot of flexibility with the creation of optional. It’s very simple for primitive types and this simplicity is extended for even complex types.

The “in_place” construction is especially interesting, and the tag std::in_place is also supported in other types like any and variant.

For example, you can write:

// https://godbolt.org/g/FPBSak
struct Point
{
    Point(int a, int b) : x(a), y(b) {}

    int x;
    int y;
};

std::optional<Point> opt{std::in_place, 0, 1};
// vs
std::optional<Point> optTemp{{0, 1}};

This saves the creation of a temporary Point object.
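One way to see the difference is to count copy and move constructions. In the sketch below, the counter and the two helper functions are additions made up for the experiment; the temporary is also spelled out as Point{0, 1} to make it explicit.

```cpp
#include <optional>

struct Point {
    Point(int a, int b) : x(a), y(b) {}
    Point(const Point& o) : x(o.x), y(o.y) { ++copiesOrMoves; }
    Point(Point&& o) noexcept : x(o.x), y(o.y) { ++copiesOrMoves; }

    int x;
    int y;
    static inline int copiesOrMoves = 0;  // instrumentation for this sketch only
};

// Constructs the Point directly inside the optional's storage.
int CountForInPlace() {
    Point::copiesOrMoves = 0;
    std::optional<Point> opt{std::in_place, 0, 1};
    return Point::copiesOrMoves;
}

// Creates a temporary Point first and then moves it into the optional.
int CountForTemporary() {
    Point::copiesOrMoves = 0;
    std::optional<Point> opt{Point{0, 1}};
    return Point::copiesOrMoves;
}
```

CountForInPlace() reports no copies or moves, while CountForTemporary() reports one move. For a small type like Point the cost is negligible, but it matters for types that are expensive or impossible to move.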

I’ll address std::in_place later in a separate post, so stay tuned.

Returning std::optional

If you return an optional from a function, then it’s very convenient to return just std::nullopt or the computed value.

std::optional<std::string> TryParse(Input input)
{
    if (input.valid())
        return input.asString();

    return std::nullopt;
}

In the above example you can see that I return std::string computed from input.asString() and it’s wrapped in optional. If the value is unavailable then you can just return std::nullopt.

Of course, you can also declare an empty optional at the beginning of your function and reassign if you have the computed value. So we could rewrite the above example as:

std::optional<std::string> TryParse(Input input)
{
    std::optional<std::string> oOut; // empty

    if (input.valid())
        oOut = input.asString();

    return oOut;
}

It probably depends on the context which version is better. I prefer short functions, so I’d choose the first option (with multiple returns).

Accessing The Stored Value

Probably the most important operation for optional (apart from creation) is fetching the contained value.

There are several options:

  • operator* and operator-> - similar to iterators. If there’s no value the behaviour is undefined!
  • value() - returns the value, or throws std::bad_optional_access
  • value_or(defaultVal) - returns the value if available, or defaultVal otherwise.

To check if the value is present you can use the has_value() method or just write if (optional), as optional is contextually converted to bool.

Here’s an example:

// by operator*
std::optional<int> oint = 10;
std::cout << "oint " << *oint << '\n';

// by value()
std::optional<std::string> ostr("hello");
try
{
    std::cout << "ostr " << ostr.value() << '\n';
}
catch (const std::bad_optional_access& e)
{
    std::cout << e.what() << "\n";
}

// by value_or()
std::optional<double> odouble; // empty
std::cout << "odouble " << odouble.value_or(10.0) << '\n';

So the most useful way is probably just to check if the value is there and then access it:

// compute string function:
std::optional<std::string> maybe_create_hello();
// ...

if (auto ostr = maybe_create_hello(); ostr)
    std::cout << "ostr " << *ostr << '\n';
else
    std::cout << "ostr is null\n";

std::optional Operations

Let’s see what other operations are available on the type:

Changing the value

If you have an existing optional object, then you can easily change the contained value by using several operations like emplace, reset, swap, assign. If you assign nullopt (or call reset) and the optional contains a value, then the destructor of that value will be called.

Here’s a little summary:

#include <optional>
#include <iostream>
#include <string>

class UserName
{
public:
    explicit UserName(const std::string& str) : mName(str)
    {
        std::cout << "UserName::UserName(\'";
        std::cout << mName << "\')\n";
    }
    ~UserName()
    {
        std::cout << "UserName::~UserName(\'";
        std::cout << mName << "\')\n";
    }

private:
    std::string mName;
};

int main()
{
    std::optional<UserName> oEmpty;

    // emplace:
    oEmpty.emplace("Steve");

    // calls ~Steve and creates new Mark:
    oEmpty.emplace("Mark");

    // reset so it's empty again
    oEmpty.reset(); // calls ~Mark
    // same as:
    //oEmpty = std::nullopt;

    // assign a new value:
    oEmpty.emplace("Fred");
    oEmpty = UserName("Joe");
}

The code is available here: @Coliru

Comparisons

std::optional allows you to compare contained objects almost “normally”, but with a few exceptions when the operands are nullopt. See below:

#include <optional>
#include <iostream>

int main()
{
    std::optional<int> oEmpty;
    std::optional<int> oTwo(2);
    std::optional<int> oTen(10);

    std::cout << std::boolalpha;
    std::cout << (oTen > oTwo) << "\n";
    std::cout << (oTen < oTwo) << "\n";
    std::cout << (oEmpty < oTwo) << "\n";
    std::cout << (oEmpty == std::nullopt) << "\n";
    std::cout << (oTen == 10) << "\n";
}

The above code generates:

true  // (oTen > oTwo)
false // (oTen < oTwo)
true  // (oEmpty < oTwo)
true  // (oEmpty == std::nullopt)
true  // (oTen == 10)

The code is available here: @Coliru

Examples of std::optional

Here are a few longer examples where std::optional fits nicely.

User name with an optional nickname and age

#include <optional>
#include <iostream>
#include <string>

class UserRecord
{
public:
    UserRecord(const std::string& name, std::optional<std::string> nick, std::optional<int> age)
    : mName{name}, mNick{nick}, mAge{age}
    {
    }

    friend std::ostream& operator<<(std::ostream& stream, const UserRecord& user);

private:
    std::string mName;
    std::optional<std::string> mNick;
    std::optional<int> mAge;
};

std::ostream& operator<<(std::ostream& os, const UserRecord& user)
{
    os << user.mName << ' ';
    if (user.mNick) {
        os << *user.mNick << ' ';
    }
    if (user.mAge)
        os << "age of " << *user.mAge;

    return os;
}

int main()
{
    UserRecord tim {"Tim", "SuperTim", 16};
    UserRecord nano {"Nathan", std::nullopt, std::nullopt};

    std::cout << tim << "\n";
    std::cout << nano << "\n";
}

The code is available here: @Coliru

Parsing ints from the command line

#include <optional>
#include <iostream>
#include <string>

std::optional<int> ParseInt(char* arg)
{
    try
    {
        return { std::stoi(std::string(arg)) };
    }
    catch (...)
    {
        std::cout << "cannot convert \'" << arg << "\' to int!\n";
    }

    return {};
}

int main(int argc, char* argv[])
{
    if (argc >= 3)
    {
        auto oFirst = ParseInt(argv[1]);
        auto oSecond = ParseInt(argv[2]);

        if (oFirst && oSecond)
        {
            std::cout << "sum of " << *oFirst << " and " << *oSecond;
            std::cout << " is " << *oFirst + *oSecond << "\n";
        }
    }
}

The code is available here: @Coliru

The above code uses optional to indicate whether we performed the conversion or not. Note that we, in fact, converted exception handling into optional, so we skip the errors that might appear. This might be “controversial” as usually we should report errors.
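If you would rather avoid exceptions entirely, a possible variation is to use std::from_chars from C++17. This is only a sketch: ParseIntNoThrow is a hypothetical name, and the behaviour is stricter than std::stoi because leading whitespace, a leading '+' and trailing characters are all rejected here.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Sketch: the same idea as ParseInt, but without exceptions.
inline std::optional<int> ParseIntNoThrow(std::string_view text)
{
    int value = 0;
    const char* first = text.data();
    const char* last = text.data() + text.size();
    const auto [ptr, ec] = std::from_chars(first, last, value);

    // Reject conversion errors and trailing characters such as "12abc".
    if (ec != std::errc{} || ptr != last)
        return std::nullopt;

    return value;
}
```

The caller still gets only "value or nothing", so the reason for the failure is lost in exactly the same way as in the version with std::stoi.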

Other examples

  • Representing other optional entries for your types, like in the example of a user record. It’s better to write std::optional<Key> rather than use a comment to make notes like // if the 'key' is 0x7788 then it's empty or something :)
  • Return values for Find*() functions (assuming you don’t care about errors, like connection drops, database errors or something)
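As a small illustration of the second point, a lookup function might be sketched like this. FindAge and its parameters are hypothetical names, and the sketch assumes the data lives in a std::map, so there are no connection or database errors to report.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical lookup helper: returns the stored age, or an empty optional
// when the name is not in the map.
inline std::optional<int> FindAge(const std::map<std::string, int>& ages,
                                  const std::string& name)
{
    if (const auto it = ages.find(name); it != ages.end())
        return it->second;

    return std::nullopt;
}
```

The caller can then write FindAge(ages, "Tim").value_or(-1) or test the result in an if statement, without any special "not found" value baked into the int.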

Performance & Memory consideration

When you use std::optional you’ll pay with increased memory footprint. At least one extra byte is needed.

Conceptually your version of the standard library might implement optional as:

template <typename T>
class optional
{
    bool _initialized;
    std::aligned_storage_t<sizeof(T), alignof(T)> _storage;

public:
    // operations
};

In short, optional just wraps your type, prepares a space for it and then adds one boolean parameter. This means it will extend the size of your type according to the alignment rules.

Alignment rules are important, as the standard defines:

Class template optional [optional.optional]:
The contained value shall be allocated in a region of the optional storage suitably aligned for the type T.

For example:

// sizeof(double) = 8
// sizeof(int) = 4
std::optional<double> od; // sizeof = 16 bytes
std::optional<int> oi;    // sizeof = 8 bytes

While the bool type usually takes only one byte, the optional type needs to obey the alignment rules and thus the whole wrapper is larger than just sizeof(YourType) + 1 byte.

For example, if you have a type like:

struct Range
{
    std::optional<double> mMin;
    std::optional<double> mMax;
};

it will take more space than when you use your custom type:

struct Range
{
    bool mMinAvailable;
    bool mMaxAvailable;
    double mMin;
    double mMax;
};

In the first case, we’re using 32 bytes! The second version is 24 bytes.
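The comparison can be checked at compile time. The sketch below renames the two structures (RangeWithOptional and RangeWithFlags are hypothetical names) so that both fit in one file. The exact sizes are implementation-defined, so the assertions only state relations that I would expect to hold on the common standard library implementations.

```cpp
#include <optional>

struct RangeWithOptional
{
    std::optional<double> mMin;
    std::optional<double> mMax;
};

struct RangeWithFlags
{
    bool mMinAvailable;
    bool mMaxAvailable;
    double mMin;
    double mMax;
};

// The "has value" flag needs at least one byte on top of the wrapped type.
static_assert(sizeof(std::optional<double>) > sizeof(double));
static_assert(sizeof(std::optional<int>) > sizeof(int));

// Two separately padded flags cost at least as much as two packed bools.
static_assert(sizeof(RangeWithOptional) >= sizeof(RangeWithFlags));
```

In the hand-written structure the two flags share one padded slot, while each optional pads its own flag up to the alignment of double.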

Test code using Compiler Explorer

Here’s a great description about the performance and memory layout taken from boost documentation: Performance considerations - 1.67.0.

And in Efficient optional values | Andrzej’s C++ blog the author discusses how to write a custom optional wrapper that might be a bit faster

I wonder if there’s a chance to do some compiler magic and reuse some space and fit this extra “initialized flag” inside the wrapped type. So no extra space would be needed.

Migration from boost::optional

std::optional was adapted directly from boost::optional, so you should see the same experience in both versions. Moving from one to another should be easy, but of course, there are little differences.

In the paper: N3793 - A proposal to add a utility class to represent optional objects (Revision 4) - from 2013-10-03 I’ve found the following table (and I tried to correct it when possible)

Warning! This table might not be 100% correct with the current versions!
Let me know if you see any other differences, I’ll try to include it into the table.

aspect | std::optional | boost::optional (as of 1.67.0)
Move semantics | yes | no (yes in current boost)
noexcept | yes | no (yes in current boost)
hash support | yes | no
a throwing value accessor | yes | no
literal type | partially | no
in place construction | emplace, tag in_place | utility in_place_factory
disengaged state tag | nullopt | none
optional references | no (optionally) | yes
conversion from optional<U> to optional<T> | no | yes
duplicated interface functions (is_initialized, reset, get) | no | yes
explicit convert to ptr (get_ptr) | no | yes

Special case: optional<bool> and optional<T*>

While you can use optional on any type you need to pay special attention when trying to wrap boolean or pointers.

std::optional<bool> ob - what does it model? With such construction you basically have a tri-state bool. So if you really need it, then maybe it’s better to look for a real tri-state bool like boost::tribool.

What’s more, it might be confusing to use such a type because ob converts to bool if there’s a value inside and *ob returns that stored value (if available).
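The two different questions can be spelled out in code. In this sketch HasAnswer and IsYes are hypothetical helper names, introduced only to show that "is there a value?" and "is the value true?" give different answers for an optional that holds false.

```cpp
#include <optional>

// Returns true when the optional holds a value, whatever that value is.
inline bool HasAnswer(const std::optional<bool>& ob)
{
    return ob.has_value();
}

// Returns true only when a value is present and that value is true.
inline bool IsYes(const std::optional<bool>& ob)
{
    return ob.value_or(false);
}
```

For std::optional<bool> ob = false; the check if (ob) succeeds, because a value is present, while *ob is false. Named helpers like the ones above make the intent explicit at the call site.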

You have a similar confusion with pointers:

// don't use like that! only an example!
std::optional<int*> opi { new int(10) };
if (opi && *opi)
{
    std::cout << **opi << std::endl;
    delete *opi;
}
if (opi)
    std::cout << "opi is still not empty!";

The pointer to int is naturally “nullable”, so wrapping it into optional makes it very hard to use.

Wrap up

Uff… ! it was a lot of text about optional, but still it’s not all :)

Yet, we’ve covered the basic usage, creation and operations of this useful wrapper type. I believe we have a lot of cases where optional fits perfectly and much better than using some predefined values to represent nullable types.

I’d like to remember the following things about std::optional:

  • std::optional is a wrapper type to express “null-able” types.
  • std::optional won’t use any dynamic allocation
  • std::optional contains a value or it’s empty
    • use operator *, operator->, value() or value_or() to access the underlying value.
  • std::optional is contextually converted to bool so that you can easily check if it contains a value or not.

In the next article I’ll try to explain error handling and why optional is maybe not the best choice there.

I’d like to thank Patrice Roy (@PatriceRoy1) and Jacek Galowicz (@jgalowicz) for finding time to do a quick review of this article!

C++ Templates - The Complete Guide 2nd Book Review


C++ Templates - The Complete Guide 2nd Book review

A few months ago I received a quite massive mail package with something that was looking like a brand new C++ book :)

My initial plan was to review it quickly, maybe in one month. But I failed, as learning C++ templates is not that easy :) I needed much more time.

Time passed by and now I am ready for the review, so here you have it :) See my thoughts about the fantastic book on C++ templates, “the templates book” as many people call it.

Note: I got this book from the authors, but the review is not sponsored in any other form.

The Book

C++ Templates: The Complete Guide (2nd Edition)
by David Vandevoorde, Nicolai M. Josuttis and Douglas Gregor

The main book’s site: www.tmplbook.com.

I own the printed copy, and it looks impressive:

C++ Templates 2nd edition

The Structure

The book consists of 822 pages packed into 33 chapters!

There are three main parts:

  1. Basics
  2. Technical Details
  3. Templates and Design

Here’s the summary of the contents:

  • Basics
    • Function Templates
    • Class Templates
    • Nontype Template Parameters
    • Variadic Templates
    • Keyword typename, Zero Initialization, Templates for Raw Array and String Literals
    • Variable Templates and Template Template Parameters
    • Move Semantics and enable_if<>
    • Template Template Parameters
    • By Value or By Reference?
    • Compile-Time Programming
    • Using Templates in Practice
    • Template Terminology
    • Generic Libraries

This section should probably be read by every C++ programmer, as it discusses the underlying principles for templates: how they work and when we can use them. We go from simple function templates like

template <typename T>
T max(T a, T b) {...}

And once the authors introduced background vocabulary and theory, we can move to class templates like:

template <typename T>
class Stack {...};

The whole part adds more and more advanced topics, and it’s written in a tutorial style.

  • Technical Details
    • Declarations, Arguments, and Parameters
    • Names and Parsing
    • Instantiation
    • Argument Deduction
    • Specialization and Overloading
    • Future Directions

In the second part, we dive into very advanced topics, and the book becomes more of a reference. You can read it all, or focus on the sections that you need.
In the “Future Directions” chapter there are topics related to upcoming C++ highlights like modules, concepts.

  • Templates and Design
    • Static Polymorphism
    • Traits and Policy Classes
    • Type Overloading
    • Templates and Inheritance
    • std::function<>
    • Metaprogramming
    • Typelists, Tuples, and Discriminated Unions
    • Expression Templates
    • Debugging Templates

After you have the basics, you can jump into programming techniques related to templates. The “traits” chapters are especially handy as by learning how they are implemented you can efficiently learn templates.
There’s also the “Debugging” chapter so you can learn techniques to make your life easier when the compiler reports several pages of compiler errors with templates :)

My View

This is a massive book!

I need to be honest with you: I still haven’t finished reading it (and it’s almost five months since I started). Such delay is, however, a very positive feature of the book because it’s not a “read-it-over-the-weekend” book. It’s filled with solid material and, let’s be clear, usually complicated stuff.

Probably the essential feature of this book is its relevance and that it is based on modern C++ - thus we have techniques from C++11, C++14 and of course C++17. What’s more, there are even topics about upcoming features, so you’ll be prepared for the future. The authors are ISO members with a vast amount of experience in C++, so you can be sure you get very comprehensive material.

The first part - the basics - is written, as mentioned, in a tutorial style, so you can just read it from the first chapter to the last and gradually learn more and more. It starts with the basic samples and ends with complex cases. A more advanced code sample is for example how to implement call that invokes a callable object and forward all the input arguments to this object. Of course with variadic templates and auto return type.
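To give a flavour of that last sample, such a helper could look roughly like the sketch below. This is my own approximation built on std::invoke, not the listing from the book, and the add function exists only to have something to call.

```cpp
#include <functional>
#include <utility>

// Sketch (not the book's listing): forwards a callable and its arguments
// to std::invoke and preserves the exact return type.
template <typename Callable, typename... Args>
decltype(auto) call(Callable&& op, Args&&... args)
{
    return std::invoke(std::forward<Callable>(op),
                       std::forward<Args>(args)...);
}

// Small function used only to demonstrate the helper.
inline int add(int a, int b) { return a + b; }
```

A call such as call(add, 2, 3) forwards both arguments to add and returns its result unchanged.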

Then we have the third section - with so many real programming examples of how we can use templates.

For example one month ago I was on a local C++ User Group Krakow meeting (link here) and there was a great live coding by Tomasz Kaminski about implementing tuples. I think that if you know how to implement tuples, then you’re quite a template expert :) Here, in the book, you have a separate chapter on the topic of tuples. I could read it and slowly try to understand what’s going on.

Summary

Final mark: 5/5 + Epic Badge! :)

An epic book that will fill a lot of your time and will leave you with solid knowledge about modern C++ templates (including C++11, C++14 and C++17… and even some insight about upcoming things in C++20). What can I say more? :)

What’s more, I can add that the link to the book was posted at r/cpp and it wasn’t downvoted. In one comment someone said that this book (also the first version) is considered as “the template book”

See the full thread at r/cpp/tmplbook2

You can also see a good presentation by N. Josuttis (one of the authors) that happened in the recent ACCU 2018, where Nicolai talks about how the book was written (and a bit about the first edition):

To sum things up: if you want to learn templates, here is the book for you :)

Let me know what you think about it.

  • Have you seen it already?
  • What other resources do you use to learn about C++ templates?

Error Handling and std::optional


Error handling and std::optional

In my last two posts in the C++17 STL series, I covered how to use std::optional. This wrapper type (also called “vocabulary type”) is handy when you’d like to express that something is ‘nullable’ and might be ‘empty’. For example, you can return std::nullopt to indicate that the code generated an error… but is this the best choice?

What’s the problem

Let’s see an example:

struct SelectionData
{
    bool anyCivilUnits {false};
    bool anyCombatUnits {false};
    int numAnimating {0};
};

std::optional<SelectionData>
CheckSelection(const ObjSelection& objList)
{
    if (!objList.IsValid())
        return {};

    SelectionData out;

    // scan...

    return {out};
}

This code comes from my older post about refactoring with std::optional.

The basic idea is that if the selection is valid, you can perform a scan and look for “civil units”, “combat units” or a number of animating objects. Once the scan is complete, we can build an object SelectionData and wrap it with std::optional. If the selection is not ready, then we return nullopt - empty optional.

While the code looks nice, you might ask one question: what about error handling?

The problem with std::optional is that we lose information about errors. The function returns a value or something empty, so you cannot tell what went wrong. In the case of this function, we only had one way to exit earlier - if the selection is not valid. But in a more complicated example, there might be a few reasons.

What do you think? Is this a legitimate use of std::optional?

Let’s try to find the answer.

The Series

This article is part of my series about C++17 Library Utilities. Here’s the list of the other topics that I’ll cover:

  • Refactoring with std::optional
  • Using std::optional
  • Error handling and std::optional (this post)
  • Using std::variant
  • Using std::any
  • In place construction for std::optional, std::variant and std::any
  • Using std::string_view
  • C++17 string searchers & conversion utilities
  • Working with std::filesystem
  • Something more? :)

Resources about C++17 STL:

Error handling

As you might already know there are a lot of ways to handle errors. And what’s even more complicated is that we have different kinds of errors.

In C++, we can do two things:

  • use some error code / special value
  • throw an exception

of course with a few variations:

  • return some error code and return a computed value as an output parameter
  • return a unique value for the computed result to indicate an error (like -1, npos)
  • throw an exception - since exceptions are considered “heavy” and add some overhead a lot of projects use them sparingly.
    • plus we have to make a decision what to throw
  • return a pair <value, error_code>
  • return a variant/discriminated union <value, error>
  • set some special global error object (like errno for fopen) - often in C style API
  • others… ?

In a few papers and articles I’ve seen a nice term “disappointment” that relates to all kinds of errors and “problems” that code might generate.

We might have a few types of disappointments:

  • System/OS
  • Serious
  • Major
  • Normal
  • Minor
  • Expected / probable.

Furthermore, we can see the error handling in terms of performance. We’d like it to be fast and using some additional machinery to facilitate errors might not be an option (like in the embedded world). Thus, for example, exceptions are considered “heavy” and usually not used in low-level code.

Where does std::optional fit?

I think, with std::optional we simply got another tool that can enhance the code.

std::optional Version

As I noted several times, std::optional should be mainly used in the context of nullable types.

From the boost::optional documentation: When to use Optional

It is recommended to use optional<T> in situations where there is exactly one, clear (to all parties) reason for having no value of type T, and where the lack of value is as natural as having any regular value of T.

I can also argue that since optional adds a “null” value to our type, it’s close to using pointers and nullptr. For example, I’ve seen a lot of code where a valid pointer was returned in the case of the success and nullptr in the case of an error.

TreeNode* FindNode(TheTree* pTree, string_view key)
{
    // find...
    if (found)
        return pNode;

    return nullptr;
}

Or if we go to some C-level functions:

FILE* pFile = nullptr;
pFile = fopen("temp.txt", "w");
if (pFile != NULL)
{
    fputs("fopen example", pFile);
    fclose(pFile);
}

And even in C++ STL we return npos in the case of failed string searches. So rather than nullptr it uses a special value to indicate an error (maybe not a failure but a probable situation that we failed to find something).

std::string s = "test";
if (s.find('a') == std::string::npos)
    std::cout << "no 'a' in 'test'\n";

I think that in the above example - with npos, we could safely rewrite it to optional. And every time you have a function that computes something and the result might be empty - then std::optional is a way to go.
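As a sketch of that rewrite, the npos check could be hidden behind a small wrapper. FindChar is a hypothetical name; the Standard Library itself offers no such function.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical wrapper: reports the position of `ch`, or an empty optional
// instead of the std::string_view::npos sentinel.
inline std::optional<std::size_t> FindChar(std::string_view text, char ch)
{
    const auto pos = text.find(ch);
    if (pos == std::string_view::npos)
        return std::nullopt;

    return pos;
}
```

The caller can now write if (auto pos = FindChar(s, 'a'); pos) and never has to remember which special value means "not found".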

When another developer sees a declaration like:

std::optional<Object> PrepareData(inputs...);

It’s clear that Object might sometimes not be computed and it’s much better than

// returns nullptr if failed! check for that!
Object* PrepareData(inputs...);

While the version with optional might look nicer, the error handling is still quite “weak”.

How about other ways?

Alternatively, if you’d like to transfer more information about the ‘disappointments’ you can think about std::variant<Result, Error_Code> or a new proposal Expected<T, E> that wraps the expected value with an error code. At the caller site, you can examine the reason for the failure:

// imaginary example for std::expected
std::expected<Object, error_code> PrepareData(inputs...);

// call:
auto data = PrepareData(...);
if (data)
    use(*data);
else
    showError(data.error());
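The std::variant option mentioned above already compiles with C++17. Here is a minimal sketch of it. ErrorCode, Object, the 16-character limit and Describe are all hypothetical and exist only to make the example self-contained.

```cpp
#include <string>
#include <variant>

// Hypothetical types for illustration only.
enum class ErrorCode { EmptyInput, TooLong };

struct Object
{
    std::string payload;
};

// Returns either the computed Object or the reason why it failed.
inline std::variant<Object, ErrorCode> PrepareData(const std::string& input)
{
    if (input.empty())
        return ErrorCode::EmptyInput;
    if (input.size() > 16)
        return ErrorCode::TooLong;

    return Object{input};
}

// Caller side: inspect which alternative is active.
inline std::string Describe(const std::variant<Object, ErrorCode>& result)
{
    if (const auto* obj = std::get_if<Object>(&result))
        return "ok: " + obj->payload;

    switch (std::get<ErrorCode>(result))
    {
    case ErrorCode::EmptyInput: return "error: empty input";
    case ErrorCode::TooLong:    return "error: input too long";
    }
    return "error: unknown";
}
```

Compared with optional, the caller pays with slightly more verbose access code, but the reason for the failure travels together with the result.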

When you have optional, then you have to check if the value is there or not. I like the functional style ideas from Simon Brand where you can change code like:

std::optional<image_view> get_cute_cat (image_view img) {
    auto cropped = find_cat(img);
    if (!cropped) {
        return std::nullopt;
    }

    auto with_sparkles = make_eyes_sparkle(*cropped);
    if (!with_sparkles) {
        return std::nullopt;
    }

    return add_rainbow(make_smaller(*with_sparkles));
}

Into:

tl::optional<image_view> get_cute_cat (image_view img){
return find_cat(img)
.and_then(make_eyes_sparkle)
.map(make_smaller)
.map(add_rainbow);
}

More in his post: Functional exceptionless error-handling with optional and expected
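std::optional in C++17 has no and_then member, but the idea can be approximated with a free function. The sketch below is not how tl::optional is implemented; and_then, half_if_even and HalfTwice are hypothetical names, and the helper assumes that the passed function itself returns a std::optional.

```cpp
#include <functional>
#include <optional>
#include <type_traits>
#include <utility>

// Hypothetical free-function helper (C++17 std::optional has no and_then):
// calls `f` with the contained value, or passes the empty state through.
// `f` must itself return a std::optional.
template <typename T, typename F>
auto and_then(const std::optional<T>& opt, F&& f)
    -> std::invoke_result_t<F, const T&>
{
    if (opt)
        return std::invoke(std::forward<F>(f), *opt);

    return std::nullopt;
}

// A small step used to demonstrate chaining.
inline std::optional<int> half_if_even(int x)
{
    if (x % 2 == 0)
        return x / 2;
    return std::nullopt;
}

// Chains the step twice; stops at the first empty result.
inline std::optional<int> HalfTwice(int x)
{
    return and_then(and_then(std::optional<int>{x}, half_if_even),
                    half_if_even);
}
```

The nested calls read less nicely than the member-function chain, which is one reason a dedicated library type is attractive for this style.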

New proposal

When I was writing the article Herb Sutter published a brand new paper on a similar topic:

PDF P0709 R0 - Zero-overhead deterministic exceptions: Throwing values.

It will be discussed in the next C++ ISO Meeting in Rapperswil at the beginning of June.

Herb Sutter discusses the current options for error handling and their pros and cons. But the main thing is the proposal of throws, a new version of the exception handling mechanism.

This proposal aims to marry the best of exceptions and error codes: to allow a function to declare that it throws values of a statically known type, which can then be implemented exactly as efficiently as a return value.
Throwing such values behaves as if the function returned union{R;E;}+bool where on success the function returns the normal return value R and on error the function returns the error value type E, both in the same return channel including using the same registers. The discriminant can use an unused CPU flag or a register.

For example:

string func() throws // new keyword! not "throw"
{
    if (flip_a_coin()) throw arithmetic_error::something;

    return "xyzzy"s + "plover"; // any dynamic exception
                                // is translated to error
}

int main() {
    try {
        auto result = func();
        cout << "success, result is: " << result;
    }
    catch (error err) { // catch by value is fine
        cout << "failed, error is: " << err.error();
    }
}

In general, the proposal aims for having an exception-style syntax, while keeping the zero-overhead and type safety.

Consistency & Simplicity

I believe that while we have a lot of options and variations on error handling the key here is “the consistency“.

If you have a single project that uses 10 ways of error handling it might be hard to write new parts as programmers will be confused what to use.

It’s probably not possible to stick to the single version: in some critical performance code exceptions are not an option, or even wrapper types (like optional, variant, expected) are adding some overhead. Keeping the minimum of the right tools is the ideal path.

Another thought on this matter is how clear and straightforward your code is. If you have relatively short functions that do only one thing, then it’s easy to represent disappointments - as there are just a few options. But if your method is long, with a few responsibilities, then you might get a whole new complexity of errors.

Keeping code simple will help the caller to handle the outcome in a clear manner.


Wrap up

In this article, I reviewed some of the options to handle errors (or disappointments) in our C++ code. We even looked at the future when I mentioned new Herb Sutter’s proposal about “Zero-overhead deterministic exceptions”.

Where does std::optional fit?

It allows you to express nullable types. So if you have code that returns some special value to indicate that the computation failed, then you can think about wrapping it with optional. The key thing is that optional doesn’t convey the reason for the failure, so you still have to use some other mechanisms.

With optional you have a new tool to express your ideas. And the key here, as always, is to be consistent and write simple code, so it doesn’t bring confusion to other developers.

What’s your opinion about using optional for error handling?
Do you use it that way in your code?

See previous post in the series: Using C++17 std::optional

Here are some other articles that might help:

And also here a presentation from Meeting C++ 2017 about std::expected:

Show me your code: std::optional


std::optional contest

Show me your code!

I’d like to run a little experiment.

Let’s build a wall of examples of std::optional!

Intro

In the last three articles of my C++17 STL series I’ve been discussing how to use std::optional. I can talk and talk… or write and write… but I’m wondering how do you use this wrapper type?

That’s why I prepared a little experiment and a giveaway:

The rules

It’s all about your (short) examples of std::optional.
Later, I plan to compose a new blog post with all of the submissions.

  • Send me a link to gist/coliru/compiler explorer… etc - with a short example of std::optional.
    • You can add a link in the comments below or send me an email
      • bartlomiej DOT filipek AT bfilipek DOT com
    • Please mention if you allow showing your Name next to the example
    • This submission is one-time only so that I won’t add you to my email list automatically. However, if you’d like to stay updated about the results and future posts, then please subscribe.
  • Ideally the max number of lines is 25 (not taking into account main() or the caller’s code).
    • Feel free to submit the code if it’s longer, we’ll think how to make it more compact.
  • Add description what the code does.
  • The code should represent some “real-life” use.
  • The code cannot, of course, violate any copyright rules.
  • I’ll select most useful examples and compose a single post about optional examples
  • You can submit only one code sample.

Usually std::optional is used in:

  • To return something from a function
  • As an optional input parameter to a function
  • As an optional class member
  • To perform some lazy loading/two-phase init of some object

So probably your code will be one of the above variations... but of course you might come up with something different.
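To illustrate the last variation, lazy loading with an optional class member might be sketched as follows. LazyDescription and everything inside it are hypothetical names, and the "expensive" computation is just a string concatenation here.

```cpp
#include <optional>
#include <string>
#include <utility>

// Hypothetical example: the expensive description is built on first use.
class LazyDescription
{
public:
    explicit LazyDescription(std::string name) : mName(std::move(name)) {}

    // Builds the text once, then returns the cached value.
    const std::string& Get()
    {
        if (!mCached)
        {
            mCached = "description of " + mName;
            ++mBuildCount;
        }
        return *mCached;
    }

    int BuildCount() const { return mBuildCount; }

private:
    std::string mName;
    std::optional<std::string> mCached; // empty until Get() is called
    int mBuildCount = 0;
};

// Usage: two reads trigger only one build.
inline int BuildCountAfterTwoReads()
{
    LazyDescription desc{"item"};
    desc.Get();
    desc.Get();
    return desc.BuildCount();
}
```

The empty optional plays the role of "not computed yet", so no separate flag or sentinel string is needed.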

Dates:
It starts now! (28th May)
Ends 7th June (8:00 am GMT+2 Time, Poland) (so I can prepare a post that will be published on 11th June)

For a start here’s a Coliru link with some basic sample:
Coliru sample std::optional code

For example, this is my source code that I’ve shared in some previous posts:

struct SelectionData
{
    bool anyCivilUnits {false};
    bool anyCombatUnits {false};
    int numAnimating {0};
};

std::optional<SelectionData>
CheckSelection(const ObjSelection& objList)
{
    if (!objList.IsValid())
        return {};

    SelectionData out;

    // scan...

    return {out};
}

The gift

I have 2 x 25$ Amazon.com Gift Card.
I’ll pick two random winners from all the submissions.

Note: It’s a US gift card, so you’ll be able to use it on Amazon.com only.

The Series

This article is part of my series about C++17 Library Utilities. Here’s the list of the other topics that I’ll cover:

Resources about C++17 STL:

I’m waiting for your code!

Everything You Need to Know About std::variant from C++17


Using std::variant in C++17

Around the time C++17 was being standardized I saw magical terms like “discriminated union”, “type-safe union” or “sum type” floating around. Later it appeared to mean the same type: “variant”.

Let’s see how this brand new std::variant from C++17 works and where it might be useful.

The Basics

In my experience, I haven’t used unions much. But when I did, it was mostly some low-level stuff.

For example for floating point optimization:

union SuperFloat
{
    float f;
    int i;
};

int RawMantissa(SuperFloat f)
{
    return f.i & ((1 << 23) - 1);
}
int RawExponent(SuperFloat f)
{
    return (f.i >> 23) & 0xFF;
}

Or a convenient access to Vector3/Vector4 types:

class VECTOR3D
{
public:
    // operations, etc...

    union
    {
        float m[3];

        struct
        {
            float x, y, z;
        };
    };
};

VECTOR3D v;
// same effect
v.m[0] = 1.0f;
v.x = 1.0f;

As you can see those are useful, but quite a low-level usage, even C-style.
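One caveat about the SuperFloat trick: in C++, reading a union member other than the one written last is formally undefined behaviour, even though compilers commonly support it as an extension. A more portable sketch of the same bit extraction copies the bytes with std::memcpy. FloatBits is a hypothetical helper name, and the code assumes a 32-bit IEEE 754 float, which the static_asserts check.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(float) == sizeof(std::uint32_t));
static_assert(std::numeric_limits<float>::is_iec559);

// Copies the bits of a float into an integer without reading an
// inactive union member.
inline std::uint32_t FloatBits(float f)
{
    std::uint32_t bits = 0;
    std::memcpy(&bits, &f, sizeof(bits));
    return bits;
}

// Lower 23 bits: the stored fraction.
inline std::uint32_t RawMantissa(float f)
{
    return FloatBits(f) & ((1u << 23) - 1u);
}

// Next 8 bits: the biased exponent.
inline std::uint32_t RawExponent(float f)
{
    return (FloatBits(f) >> 23) & 0xFFu;
}
```

For 1.0f this gives a biased exponent of 127 and a zero mantissa, which matches the IEEE 754 single-precision layout.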

But what if you wanted to use unions more “high level”?

The problem with unions is that they’re very simple and crude. You don’t have a way to know what’s the currently used type and, what’s more, they won’t call destructors of the underlying types. Here’s an example from cppreference/union that clearly illustrates how hard it might be:

#include <iostream>
#include <string>
#include <vector>

union S
{
    std::string str;
    std::vector<int> vec;
    ~S() {} // what to delete here?
};

int main()
{
    S s = {"Hello, world"};
    // at this point, reading from s.vec is undefined behavior
    std::cout << "s.str = " << s.str << '\n';

    // you have to call destructor of the contained objects!
    s.str.~basic_string<char>();

    // and a constructor!
    new (&s.vec) std::vector<int>;

    // now, s.vec is the active member of the union
    s.vec.push_back(10);
    std::cout << s.vec.size() << '\n';

    // another destructor
    s.vec.~vector<int>();
}

Play with the code @Coliru

As you see, the S union needs a lot of maintenance from your side. You have to know which type is active and adequately call destructors/constructors before switching to a new variant.

That’s the reason you probably won’t see a lot of unions that use “advanced” types such as vectors, strings, containers, etc, etc. Union is mostly for basic types.

What could make unions better?

  • the ability to use complex types
    • and the full support of their lifetime: if you switch the type then a proper destructor is called. That way we don’t leak.
  • a way to know what’s the active type

Before C++17 you could use some third-party library…. or use boost variant. But now you have std::variant.

Here’s a basic demo of what you can do with this new type:

#include<string>
#include<iostream>
#include<variant>

struct SampleVisitor
{
    void operator()(int i) const {
        std::cout << "int: " << i << "\n";
    }
    void operator()(float f) const {
        std::cout << "float: " << f << "\n";
    }
    void operator()(const std::string& s) const {
        std::cout << "string: " << s << "\n";
    }
};

int main()
{
    std::variant<int, float, std::string> intFloatString;
    static_assert(std::variant_size_v<decltype(intFloatString)> == 3);

    // default initialized to the first alternative, should be 0
    std::visit(SampleVisitor{}, intFloatString);

    // index will show the currently used 'type'
    std::cout << "index = " << intFloatString.index() << std::endl;
    intFloatString = 100.0f;
    std::cout << "index = " << intFloatString.index() << std::endl;
    intFloatString = "hello super world";
    std::cout << "index = " << intFloatString.index() << std::endl;

    // try with get_if:
    if (const auto intPtr (std::get_if<int>(&intFloatString)); intPtr)
        std::cout << "int!" << *intPtr << "\n";
    else if (const auto floatPtr (std::get_if<float>(&intFloatString)); floatPtr)
        std::cout << "float!" << *floatPtr << "\n";

    if (std::holds_alternative<int>(intFloatString))
        std::cout << "the variant holds an int!\n";
    else if (std::holds_alternative<float>(intFloatString))
        std::cout << "the variant holds a float\n";
    else if (std::holds_alternative<std::string>(intFloatString))
        std::cout << "the variant holds a string\n";

    // try/catch and bad_variant_access
    try
    {
        auto f = std::get<float>(intFloatString);
        std::cout << "float! " << f << "\n";
    }
    catch (std::bad_variant_access&)
    {
        std::cout << "our variant doesn't hold float at this moment...\n";
    }

    // visit:
    std::visit(SampleVisitor{}, intFloatString);
    intFloatString = 10;
    std::visit(SampleVisitor{}, intFloatString);
    intFloatString = 10.0f;
    std::visit(SampleVisitor{}, intFloatString);
}

Play with the code @Coliru

We have several things shown in the example above:

  • You know what’s the currently used type via index() or check via holds_alternative.
  • You can access the value by using get_if or get (but that might throw bad_variant_access exception)
  • Type Safety - the variant doesn’t let you get a value of a type that’s not active
  • If you don’t initialize a variant with a value, then the variant is initialized with the first type. In that case the first alternative type must have a default constructor.
  • No extra heap allocation happens
  • You can use a visitor to invoke some action on the currently held type.
  • The variant class calls destructors and constructors of non-trivial types, so in the example, the string object is cleaned up before we switch to new variants.

When to Use

I’d say that if you’re doing some low-level stuff, possibly only with simple types, then unions might still be OK. But for all other use cases where you need variant types, std::variant is the way to go!

Some possible uses

  • All the places where you might get a few types for a single field: so things like parsing command lines, ini files, language parsers, etc, etc.
  • Expressing efficiently several possible outcomes of a computation: like finding roots of equations
  • Error handling - for example you can return variant<Object, ErrorCode>. If the value is available, then you return Object otherwise you assign some error code.
  • State machines
  • Polymorphism without vtables and inheritance (thanks to visiting pattern)

A Functional Background

It’s also worth mentioning that variant types (also called tagged unions, discriminated unions, or sum types) come from the functional language world and Type Theory.

After a little demo and introduction, we can now talk about some more details… so read on.

The Series

This article is part of my series about C++17 Library Utilities. Here’s the list of the other topics that I’ll cover:

Resources about C++17 STL:

std::variant Creation

There are several ways you can create and initialize std::variant:

// default initialization: (type has to have a default ctor)
std::variant<int, float> intFloat;
std::cout << intFloat.index() << ", value " << std::get<int>(intFloat) << "\n";

// monostate for default initialization:

class NotSimple
{
public:
    NotSimple(int, float) {}
};

// std::variant<NotSimple, int> cannotInit; // error
std::variant<std::monostate, NotSimple, int> okInit;
std::cout << okInit.index() << "\n";

// pass a value:
std::variant<int, float, std::string> intFloatString {10.5f};
std::cout << intFloatString.index() << ", value " << std::get<float>(intFloatString) << "\n";

// ambiguity
// double might convert to float or int, so the compiler cannot decide

//std::variant<int, float, std::string> intFloatString { 10.5 };

// ambiguity resolved by in_place
std::variant<long, float, std::string> longFloatString { std::in_place_index<1>, 7.6 }; // double!
std::cout << longFloatString.index() << ", value " << std::get<float>(longFloatString) << "\n";

// in_place for complex types
std::variant<std::vector<int>, std::string> vecStr { std::in_place_index<0>, {0, 1, 2, 3} };
std::cout << vecStr.index() << ", vector size " << std::get<std::vector<int>>(vecStr).size() << "\n";

// copy-initialize from other variant:
std::variant<int, float> intFloatSecond { intFloat };
std::cout << intFloatSecond.index() << ", value " << std::get<int>(intFloatSecond) << "\n";

Play with the code here @Coliru.

  • By default, a variant object is initialized with the first type,
    • if that’s not possible because the type doesn’t have a default constructor, then you’ll get a compiler error
    • in that case you can use std::monostate as the first type
  • You can initialize it with a value, and then the best matching type is used
    • if there’s an ambiguity, then you can use std::in_place_index to explicitly state which type should be used.
  • std::in_place_index (or std::in_place_type) also allows you to create more complex types and pass more parameters to the constructor

About std::monostate

In the example you might notice a special type called std::monostate. It’s just an empty type that can be used with variants to represent an empty state. The type might be handy when the first alternative doesn’t have a default constructor. In that situation you can place std::monostate as the first alternative.
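
As a small sketch of that idea, the snippet below uses std::monostate as an explicit "nothing here yet" state. The Connection type and the describe function are made up for this illustration and are not part of the original examples.

```cpp
#include <string>
#include <utility>
#include <variant>

// Hypothetical type without a default constructor.
struct Connection
{
    explicit Connection(std::string host) : mHost(std::move(host)) {}
    std::string mHost;
};

// std::monostate comes first, so the variant is default-constructible
// and starts in an "empty" state.
using MaybeConnection = std::variant<std::monostate, Connection>;

std::string describe(const MaybeConnection& c)
{
    if (std::holds_alternative<std::monostate>(c))
        return "not connected";

    return "connected to " + std::get<Connection>(c).mHost;
}
```

A default-constructed MaybeConnection reports "not connected"; after assigning a Connection, it reports the host name.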

Changing the Values

There are four ways to change the current value of the variant:

  • the assignment operator
  • emplace
  • get and then assign a new value for the currently active type
  • a visitor

The important part is to know that everything is type safe and also the object lifetime is honoured.

std::variant<int, float, std::string> intFloatString {"Hello"};

intFloatString = 10; // we're now an int

intFloatString.emplace<2>(std::string("Hello")); // we're now string again

// std::get returns a reference, so you can change the value:
std::get<std::string>(intFloatString) += std::string(" World");

intFloatString = 10.1f;
if (auto pFloat = std::get_if<float>(&intFloatString); pFloat)
    *pFloat *= 2.0f;

See the live example @Coliru

Object Lifetime

When you use a union, you need to manage the internal state: call constructors or destructors. This is error-prone, and it’s easy to shoot yourself in the foot. But std::variant handles object lifetime as you expect. That means that if it’s about to change the currently stored type, then the destructor of the underlying type is called.

std::variant<std::string, int> v {"Hello A Quite Long String"};
// v allocates some memory for the string
v = 10; // we call destructor for the string!
// no memory leak

Or see this example with a custom type:

class MyType
{
public:
    MyType() { std::cout << "MyType::MyType\n"; }
    ~MyType() { std::cout << "MyType::~MyType\n"; }
};

class OtherType
{
public:
    OtherType() { std::cout << "OtherType::OtherType\n"; }
    ~OtherType() { std::cout << "OtherType::~OtherType\n"; }
};

int main()
{
    std::variant<MyType, OtherType> v;
    v = OtherType();

    return 0;
}

This will produce the output:

MyType::MyType
OtherType::OtherType
MyType::~MyType
OtherType::~OtherType
OtherType::~OtherType

http://coliru.stacked-crooked.com/a/5951ae413e6f731b

At the start, we initialize with a default value of type MyType; then we change the value with an instance of OtherType, and before the assignment, the destructor of MyType is called. Later we destroy the temporary object and the object stored in the variant.

Accessing the Stored Value

From all of the examples you’ve seen so far, you might have an idea of how to access the value. But let’s make a summary of this important operation.

First of all, even if you know what’s the currently active type you cannot do:

std::variant<int, float, std::string> intFloatString {"Hello"};
std::string s = intFloatString;

// error: conversion from
// 'std::variant<int, float, std::string>'
// to non-scalar type 'std::string' requested
// std::string s = intFloatString;

So you have to use helper functions to access the value.

You have std::get<Type|Index>(variant), which is a non-member function. It returns a reference to the desired type if it’s active (you can pass a Type or an Index). If not, then you’ll get a std::bad_variant_access exception.

std::variant<int, float, std::string> intFloatString;
try
{
    auto f = std::get<float>(intFloatString);
    std::cout << "float! " << f << "\n";
}
catch (std::bad_variant_access&)
{
    std::cout << "our variant doesn't hold float at this moment...\n";
}

The next option is std::get_if. This function is also a non-member and won’t throw. It returns a pointer to the active type or nullptr. While std::get needs a reference to the variant, std::get_if takes a pointer. I’m not sure why we have this inconsistency.

if (const auto intPtr = std::get_if<0>(&intFloatString))
    std::cout << "int!" << *intPtr << "\n";

However, probably the most important way to access a value inside a variant is by using visitors.

Visitors for std::variant

With the introduction of std::variant we also got a handy STL function called std::visit.

It can call a given “visitor” on all passed variants.

Here’s the declaration:

template <class Visitor, class... Variants>
constexpr /* deduced from the visitor call */ visit(Visitor&& vis, Variants&&... vars);

And it will call vis on the currently active type of variants.

If you pass only one variant, then you have to have overloads for the types from that variant. If you give two variants, then you have to have overloads for all possible pairs of the types from the variants.

A visitor is “a Callable that accepts every possible alternative from every variant “.

Let’s see some examples:

// a generic lambda:
autoPrintVisitor=[](constauto& t){ std::cout << t <<"\n";};

std
::variant<int,float, std::string> intFloatString {"Hello"};
std
::visit(PrintVisitor, intFloatString);

In the above example, a generic lambda is used to generate all possible overloads. Since all of the types in the variant support <<, we can print them.

In another case we can use a visitor to change the value:

auto PrintVisitor = [](const auto& t) { std::cout << t << "\n"; };
auto TwiceMoreVisitor = [](auto& t) { t *= 2; };

std::variant<int, float> intFloat {20.4f};
std::visit(PrintVisitor, intFloat);
std::visit(TwiceMoreVisitor, intFloat);
std::visit(PrintVisitor, intFloat);

Generic lambdas can work if our types share the same “interface”, but in most of the cases, we’d like to do some different actions based on an active type.

That’s why we can define a structure with several overloads for the operator ():

struct MultiplyVisitor
{
    float mFactor;

    MultiplyVisitor(float factor) : mFactor(factor) {}

    void operator()(int& i) const {
        i *= static_cast<int>(mFactor);
    }

    void operator()(float& f) const {
        f *= mFactor;
    }

    void operator()(std::string&) const {
        // nothing to do here...
    }
};

std::visit(MultiplyVisitor(0.5f), intFloat);
std::visit(PrintVisitor, intFloat);

In the example, you might notice that I’ve used a state to hold the desired scaling factor value.
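
Going back to the case of visiting more than one variant at once, here is a minimal sketch of what "overloads for all possible pairs" means in practice. The Number alias and the add, PairNamer and names identifiers are assumptions made for this illustration.

```cpp
#include <string>
#include <variant>

using Number = std::variant<int, double>;

// A generic lambda covers all four (int/double) x (int/double) pairs.
// std::visit calls it with the active alternatives of both variants.
double add(const Number& a, const Number& b)
{
    return std::visit(
        [](auto x, auto y) { return static_cast<double>(x) + static_cast<double>(y); },
        a, b);
}

// A hand-written visitor has to spell out every pair explicitly.
struct PairNamer
{
    std::string operator()(int, int) const { return "int, int"; }
    std::string operator()(int, double) const { return "int, double"; }
    std::string operator()(double, int) const { return "double, int"; }
    std::string operator()(double, double) const { return "double, double"; }
};

std::string names(const Number& a, const Number& b)
{
    return std::visit(PairNamer{}, a, b);
}
```

If one of the four overloads in PairNamer were missing, the call to std::visit would fail to compile, which is the price of handling every combination explicitly.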

With lambdas, we got used to declaring things right next to their usage. When you need to write a separate structure, you have to leave that local scope. That’s why it might be handy to use the overload construction.

Overload

With this utility you can write several lambdas for all matching types in one place:

std::visit
(
    overload
    {
        [](const int& i) { std::cout << "int: " << i << "\n"; },
        [](const std::string& s) { std::cout << "it's a string: " << s << "\n"; },
        [](const float& f) { std::cout << "float: " << f << "\n"; }
    },
    yourVariant
);

Currently this helper is not part of the library (it might get in with C++20), but the code might look like this:

template<class... Ts> struct overload : Ts... { using Ts::operator()...; };
template<class... Ts> overload(Ts...) -> overload<Ts...>;

Those two lines look like a bit of magic :) But all they do is they create a struct that inherits all given lambdas and uses their Ts::operator(). The whole structure can be now passed to std::visit.

For example:

std::variant<int, float, std::string> intFloatString {"Hello"};
std::visit(overload{
    [](int& i) { i *= 2; },
    [](float& f) { f *= 2.0f; },
    [](std::string& s) { s = s + s; }
}, intFloatString);
std::visit(PrintVisitor, intFloatString);
// prints: "HelloHello"

Play with the code @Coliru

Arne Mertz wrote more about this technique in his recent post:
SimplifyC++ - Overload: Build a Variant Visitor on the Fly.

And here’s the paper for the proposal of std::overload: P0051 - C++ generic overload function

Also, if you'd like to know how std::visit works underneath, then you might want to check out this post: Variant Visitation by Michael Park

Other std::variant Operations

Just for the sake of completeness:

  • You can compare two variants of the same type:
    • if they contain the same active alternative then the corresponding comparison operator is called.
    • If one variant has an “earlier” alternative then it’s “less than” the variant with the next active alternative.
  • Variant is a value type, so you can move it.
  • std::hash on a variant is also possible.
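
The comparison and hashing rules from the list above can be sketched as follows. The Value alias and the function names are assumptions chosen for this example.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <variant>

using Value = std::variant<int, std::string>;

// Same active alternative: the stored values themselves are compared.
bool sameAlternativeLess()
{
    return Value{1} < Value{2};
}

// Different alternatives: the one with the lower index is "less",
// regardless of the stored values.
bool earlierAlternativeLess()
{
    return Value{1000} < Value{std::string("a")};
}

// std::hash is available for a variant when all alternatives are hashable.
std::size_t hashOf(const Value& v)
{
    return std::hash<Value>{}(v);
}
```

This is also what makes a variant usable as a key in std::map or std::unordered_map, as long as every alternative supports the required operation.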

Exception Safety Guarantees

So far everything looks nice and smooth… but what happens when there’s an exception during the creation of the alternative in a variant?

For example

class ThrowingClass
{
public:
    explicit ThrowingClass(int i) { if (i == 0) throw int(10); }
    operator int() { throw int(10); }
};

int main(int argc, char** argv)
{
    std::variant<int, ThrowingClass> v;

    // change the value:
    try
    {
        v = ThrowingClass(0);
    }
    catch (...)
    {
        std::cout << "catch(...)\n";
        // we keep the old state!
        std::cout << v.valueless_by_exception() << "\n";
        std::cout << std::get<int>(v) << "\n";
    }

    // inside emplace
    try
    {
        v.emplace<0>(ThrowingClass(10)); // calls the operator int
    }
    catch (...)
    {
        std::cout << "catch(...)\n";
        // the old state was destroyed, so we're now in an invalid state!
        std::cout << v.valueless_by_exception() << "\n";
    }

    return 0;
}

Play with the code @Coliru

In the first case - with the assignment operator - the exception is thrown in the constructor of the type. This happens before the old value is replaced in the variant, so the variant state is unchanged. As you can see we can still access int and print it.

However, in the second case - emplace - the exception is thrown after the old state of the variant is destroyed. Emplace calls operator int to replace the value, but that throws. After that, the variant is in a wrong state, as we cannot recover.

Also note that a variant that is “valueless by exception” is in an invalid state. Accessing a value from such variant is not possible. That’s why variant::index returns variant_npos, and std::get and std::visit will throw bad_variant_access.
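
Here is a defensive sketch of how code can cope with that state. One caveat: the Standard only says the variant "might not hold a value" after a throwing emplace, so an implementation is allowed to keep the old value for simple types, and the code should not assume either outcome. The names ThrowOnConversion, tryBreak and describe are hypothetical.

```cpp
#include <stdexcept>
#include <string>
#include <variant>

// Hypothetical helper type: converting it to int always throws.
struct ThrowOnConversion
{
    operator int() const { throw std::runtime_error("conversion failed"); }
};

using IntOrFloat = std::variant<int, float>;

// Tries to replace the stored float with an int; the conversion throws.
// Afterwards the variant is either valueless or still holds the old float,
// depending on the library implementation.
IntOrFloat tryBreak()
{
    IntOrFloat v{1.5f};
    try
    {
        v.emplace<0>(ThrowOnConversion{});
    }
    catch (const std::runtime_error&)
    {
        // swallow the error, we only care about the state of v
    }
    return v;
}

// Checks for the valueless state first, so std::visit never throws here.
std::string describe(const IntOrFloat& v)
{
    if (v.valueless_by_exception())
        return "valueless";

    return std::visit([](auto x) { return std::to_string(x); }, v);
}
```

Checking valueless_by_exception() before visiting is cheap, and it avoids an unexpected bad_variant_access in code paths that follow a failed emplace.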

Performance & Memory Considerations

std::variant uses the memory in a similar way to union: so it will take the max size of the underlying types. But since we need something that will know what’s the currently active alternative, then we need to add some more space.

Plus everything needs to honour the alignment rules.

Here are some basic sizes:

std::cout << "sizeof string: "
          << sizeof(std::string) << "\n";

std::cout << "sizeof variant<int, string>: "
          << sizeof(std::variant<int, std::string>) << "\n";

std::cout << "sizeof variant<int, float>: "
          << sizeof(std::variant<int, float>) << "\n";

std::cout << "sizeof variant<int, double>: "
          << sizeof(std::variant<int, double>) << "\n";

On GCC 8.1, 32 bit I have:

sizeof string: 32
sizeof variant<int, string>: 40
sizeof variant<int, float>: 8
sizeof variant<int, double>: 16

Play with the code @Coliru

What’s more interesting is that std::variant won’t allocate any extra space! No dynamic allocation happens to hold the variant and the discriminator.

While you pay some extra space for all the type-safe functionality, it shouldn’t cost you regarding runtime performance.

Migration From boost::variant

Boost.Variant was introduced around the year 2004, so there were 13 years of experience before std::variant was added to the Standard. The STL type draws on the experience of the Boost version and improves on it.

Here are the main changes:

Feature | Boost.Variant (1.67.0) | std::variant
Extra memory allocation | Possible on assignment, see Design Overview - Never Empty | No
visiting | apply_visitor | std::visit
get by index | no | yes
recursive variant | yes, see make_recursive_variant | no
duplicated entries | no | yes
empty alternative | boost::blank | std::monostate

You can also see the slides from
Variants - Past, Present, and Future - David Sankel - CppCon 2016, where there is more discussion about the changes and the proposal,

or the video @Youtube

Examples of std::variant

After we learned most of the std::variant details, we can now explore a few examples. So far, the code I used was a bit artificial, but in this section, I tried to look for some real-life examples.

Error Handling

The basic idea is to wrap the possible return type with some ErrorCode, and that way output more information about errors without using exceptions or output parameters. This is similar to what std::expected might be in the future (see more about std::expected here).

enum class ErrorCode
{
    Ok,
    SystemError,
    IoError,
    NetworkError
};

std::variant<std::string, ErrorCode> FetchNameFromNetwork(int i)
{
    if (i == 0)
        return ErrorCode::SystemError;

    if (i == 1)
        return ErrorCode::NetworkError;

    return std::string("Hello World!");
}

int main()
{
    auto response = FetchNameFromNetwork(0);
    if (std::holds_alternative<std::string>(response))
        std::cout << std::get<std::string>(response) << "\n";
    else
        std::cout << "Error!\n";

    response = FetchNameFromNetwork(10);
    if (std::holds_alternative<std::string>(response))
        std::cout << std::get<std::string>(response) << "\n";
    else
        std::cout << "Error!\n";

    return 0;
}

Play with the example @Coliru

In the example, I’m returning ErrorCode or a valid type - in this case, a string.

Computing Roots of an Equation

Sometimes the computation might give us several options, for example, real roots of an equation. With variant, we can wrap all the available options and express clearly how many roots we can find.

#include <cmath>
#include <iostream>
#include <utility>
#include <variant>

using DoublePair = std::pair<double, double>;
using EquationRoots = std::variant<DoublePair, double, std::monostate>;

EquationRoots FindRoots(double a, double b, double c)
{
    auto d = b*b - 4*a*c;

    if (d > 0.0)
    {
        // two real roots: (-b +/- sqrt(d)) / (2a)
        auto p = std::sqrt(d);
        return std::make_pair((-b + p) / (2*a), (-b - p) / (2*a));
    }
    else if (d == 0.0)
        return (-1*b) / (2*a);

    return std::monostate();
}

struct RootPrinterVisitor
{
    void operator()(const DoublePair& arg)
    {
        std::cout << "2 roots: " << arg.first << " " << arg.second << '\n';
    }
    void operator()(double arg)
    {
        std::cout << "1 root: " << arg << '\n';
    }
    void operator()(std::monostate)
    {
        std::cout << "No real roots found.\n";
    }
};

int main()
{
    std::visit(RootPrinterVisitor{}, FindRoots(10, 0, -2));
    std::visit(RootPrinterVisitor{}, FindRoots(2, 0, -1));
}

Play with the code @Coliru

The code is based on Pattern matching in C++17 with std::variant, std::monostate and std::visit

Parsing a Command Line

A command line might contain text arguments that can be interpreted in a few ways:

  • as integer
  • as boolean flag
  • as a string (not parsed)

So we can build a variant that will hold all the possible options.

Here’s a simple version with int and string:

class CmdLine
{
public:
    using Arg = std::variant<int, std::string>;

private:
    std::map<std::string, Arg> mParsedArgs;

public:
    explicit CmdLine(int argc, char** argv) { ParseArgs(argc, argv); }

    // ...
};

And the parsing code:

CmdLine::Arg TryParseString(char* arg)
{
    // try with int first
    int iResult = 0;
    auto res = std::from_chars(arg, arg + strlen(arg), iResult);
    if (res.ec == std::errc::invalid_argument)
    {
        // if not possible, then just assume it's a string
        return std::string(arg);
    }

    return iResult;
}

void CmdLine::ParseArgs(int argc, char** argv)
{
    // the form: -argName value -argName value
    // unnamed? later...
    for (int i = 1; i < argc; i += 2)
    {
        if (argv[i][0] != '-') // super advanced pattern matching! :)
            throw std::runtime_error("wrong command name");

        mParsedArgs[argv[i] + 1] = TryParseString(argv[i + 1]);
    }
}

At the moment of writing, std::from_chars in GCC only supports integers; in MSVC, floating point support is on the way. But the idea of TryParseString is to try parsing the input string to the best matching type. So if it looks like an integer, then we try to fetch an integer. Otherwise, we’ll return an unparsed string. Of course, we can extend this approach.
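
One possible extension, shown here only as a sketch: std::from_chars stops at the first character it cannot parse, so an input like "12abc" yields 12 with no error. The variation below (ParseStrict is a hypothetical name, not part of the original CmdLine class) accepts an int only when the whole argument was consumed and the value fits.

```cpp
#include <charconv>
#include <cstring>
#include <string>
#include <system_error>
#include <variant>

using Arg = std::variant<int, std::string>;

// Assumption: a stricter variation on TryParseString.
// Returns an int only if the entire text is a valid, in-range integer;
// otherwise the argument is kept as an unparsed string.
Arg ParseStrict(const char* arg)
{
    const char* end = arg + std::strlen(arg);
    int value = 0;
    const auto res = std::from_chars(arg, end, value);

    // errc{} means "no error"; ptr == end means nothing was left over
    if (res.ec == std::errc{} && res.ptr == end)
        return value;

    return std::string(arg);
}
```

With this check, "42" and "-7" become integers, while "12abc" and values too large for int stay as strings.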

An example of how we can use it:

try
{
    CmdLine cmdLine(argc, argv);

    auto arg = cmdLine.Find("paramInt");
    if (arg && std::holds_alternative<int>(*arg))
        std::cout << "paramInt is "
                  << std::get<int>(*arg) << "\n";

    arg = cmdLine.Find("textParam");
    if (arg && std::holds_alternative<std::string>(*arg))
        std::cout << "textParam is "
                  << std::get<std::string>(*arg) << "\n";
}
catch (std::runtime_error &err)
{
    std::cout << err.what() << "\n";
}

Play with the code @Coliru

Parsing a Config File

I don’t have code for that, but the idea comes from the previous example of a command line. In the case of a configuration file, we usually work with pairs of <Name, Value>, where Value might be a different type: string, int, array, bool, float, etc.

In my experience I’ve seen examples where even void* was used to hold such unknown type so we could improve the design by using std::variant if we know all the possible types, or leverage std::any.
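
A minimal sketch of how such a configuration store could look, assuming a fixed set of value types. The ConfigValue and Config aliases and the GetOr helper are invented for this illustration.

```cpp
#include <map>
#include <string>
#include <variant>

// Hypothetical set of value types for a config file.
using ConfigValue = std::variant<bool, int, double, std::string>;
using Config = std::map<std::string, ConfigValue>;

// Returns the value stored under `key` if it exists and has type T,
// otherwise the supplied fallback.
template <typename T>
T GetOr(const Config& cfg, const std::string& key, T fallback)
{
    const auto it = cfg.find(key);
    if (it == cfg.end())
        return fallback;

    // get_if returns nullptr when another alternative is active
    if (const T* p = std::get_if<T>(&it->second))
        return *p;

    return fallback;
}
```

One thing to watch for: in C++17, assigning a string literal to a variant that also contains bool may select bool rather than std::string, so it is safer to store text as std::string("...") explicitly.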

State Machines

How about modelling a state machine? For example, a door’s state:

Door State Machine

We can use different types of states and then use visitors as events:

struct DoorState
{
    struct DoorOpened {};
    struct DoorClosed {};
    struct DoorLocked {};

    using State = std::variant<DoorOpened, DoorClosed, DoorLocked>;

    void open()
    {
        m_state = std::visit(OpenEvent{}, m_state);
    }

    void close()
    {
        m_state = std::visit(CloseEvent{}, m_state);
    }

    void lock()
    {
        m_state = std::visit(LockEvent{}, m_state);
    }

    void unlock()
    {
        m_state = std::visit(UnlockEvent{}, m_state);
    }

    State m_state;
};

And here are the events:

struct OpenEvent
{
    State operator()(const DoorOpened&) { return DoorOpened(); }
    State operator()(const DoorClosed&) { return DoorOpened(); }
    // cannot open locked doors
    State operator()(const DoorLocked&) { return DoorLocked(); }
};

struct CloseEvent
{
    State operator()(const DoorOpened&) { return DoorClosed(); }
    State operator()(const DoorClosed&) { return DoorClosed(); }
    State operator()(const DoorLocked&) { return DoorLocked(); }
};

struct LockEvent
{
    // cannot lock opened doors
    State operator()(const DoorOpened&) { return DoorOpened(); }
    State operator()(const DoorClosed&) { return DoorLocked(); }
    State operator()(const DoorLocked&) { return DoorLocked(); }
};

struct UnlockEvent
{
    // cannot unlock opened doors
    State operator()(const DoorOpened&) { return DoorOpened(); }
    State operator()(const DoorClosed&) { return DoorClosed(); }
    // unlock
    State operator()(const DoorLocked&) { return DoorClosed(); }
};

Play with the code using the following example: @Coliru

The idea is based on the blog posts:

Polymorphism

Most of the time in C++ we can safely use runtime polymorphism based on the v-table approach. You have a collection of related types that share the same interface, and you have a well-defined virtual method that can be invoked.

But what if you have “unrelated” types that don’t share the same base class? What if you’d like to quickly add new functionality without changing the code of the supported types?

In such situations, we have the handy Visitor pattern. I’ve even described it in my older post.

With std::variant and std::visit we can build the following example:

class Triangle
{
public:
    void Render() { std::cout << "Drawing a triangle!\n"; }
};

class Polygon
{
public:
    void Render() { std::cout << "Drawing a polygon!\n"; }
};

class Sphere
{
public:
    void Render() { std::cout << "Drawing a sphere!\n"; }
};

int main()
{
    std::vector<std::variant<Triangle, Polygon, Sphere>> objects {
        Polygon(),
        Triangle(),
        Sphere(),
        Triangle()
    };

    auto CallRender = [](auto& obj) { obj.Render(); };

    for (auto& obj : objects)
        std::visit(CallRender, obj);
}

Play with the code: @Coliru

In the above example, I’ve shown only the first case of invoking a method from unrelated types. I wrap all the possible shape types into a single variant and then use a visitor to dispatch the call to the proper type.

If you’d like, for example, to sort objects, then you can write another visitor that holds some state. That way you can add more functionality without changing the types.
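
To make the idea of a visitor with state more concrete, here is a small sketch that counts shapes instead of sorting them. The empty shape structs, the Shape alias, ShapeCounter and CountShapes are simplified stand-ins invented for this example.

```cpp
#include <variant>
#include <vector>

struct Triangle {};
struct Polygon {};
struct Sphere {};

using Shape = std::variant<Triangle, Polygon, Sphere>;

// Hypothetical visitor with state: counts how many shapes of each kind it saw.
struct ShapeCounter
{
    int triangles = 0;
    int polygons = 0;
    int spheres = 0;

    void operator()(const Triangle&) { ++triangles; }
    void operator()(const Polygon&) { ++polygons; }
    void operator()(const Sphere&) { ++spheres; }
};

ShapeCounter CountShapes(const std::vector<Shape>& shapes)
{
    ShapeCounter counter;
    for (const auto& s : shapes)
        std::visit(counter, s); // passed as an lvalue, so its state is updated
    return counter;
}
```

The shape types know nothing about counting; the new behaviour lives entirely in the visitor.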

You can explore more about this pattern and its advantages in:
Another polymorphism | Andrzej’s C++ blog and in Inheritance vs std::variant, C++ Truths


Other Uses

There are many, many more examples; see this tweet:

You can open this tweet and follow the discussion.

Wrap Up

After reading this post, you should be equipped with all the knowledge required to use std::variant in your projects!

While a similar type has been available for years - in the form of boost.variant - I’m happy to see the official STL version. That way we can expect more and more code that uses this handy wrapper type.

Here are the things to remember about std::variant:

  • It holds one of several alternatives in a type-safe way
  • No extra memory allocation is needed. The variant needs the size of the max of the sizes of the alternatives, plus some little extra space for knowing the currently active value.
  • By default, it initializes with the default value of the first alternative
  • You can access the value by using std::get, std::get_if or by using a form of a visitor.
  • To check the currently active type you can use std::holds_alternative or std::variant::index
  • std::visit is a way to invoke an operation on the currently active type in the variant. It takes a callable object with overloads for all the possible types in the variant(s).
  • In rare cases std::variant might get into an invalid state; you can check it via valueless_by_exception

I’d like to thank Patrice Roy (@PatriceRoy1), Mandar Kulkarni (@mjkcool) for finding time to do a review of this article!

See also some other posts about std::variant:

A Wall of Your std::optional Examples


std::optional contest

Two weeks ago I asked you for help: I wanted to build a wall of examples of std::optional. I’m very grateful that a lot of you responded and I could move forward with the plan!

You’re amazing!

Let’s dive into the examples my readers have sent me!

A Reminder

As a reminder, I asked for some real-life examples of std::optional. It’s exciting to see in how many ways you use this vocabulary type in your projects. There are many options and variations. In this post, I’ve put all of them in a single place.

Most of the code is as I got it from the authors, in some places I had to shorten it and extract only the core parts.

Giveaway

For this experiment, I also had 2 x 25$ Amazon.com Gift Card. I randomly selected two participants, and I’ve contacted them already :)

I wonder if they spend that enormous amount of money on some C++ book or a course :)

The Series

This article is part of my series about C++17 Library Utilities. Here’s the list of the other topics that I’ll cover:

Resources about C++17 STL:

The Examples

Constructing a Query to a Database

Wojciech Razik used optional to represent possible query parameters:

class Query {
    std::optional<int> limit_;
    std::optional<std::string> name_;
    // ... more params
public:
    Query& Limit(int l) { limit_ = l; return *this; }
    Query& Name(std::string s) { name_ = std::move(s); return *this; }

    std::optional<int> GetLimit() const { return limit_; }
    std::optional<std::string> GetName() const { return name_; }
};

void Select(const Query& q) { // couts for demonstration only
    std::cout << " - \n";
    if (q.GetLimit()) {
        std::cout << "Limit: " << q.GetLimit().value() << "\n";
    }
    if (q.GetName()) {
        std::cout << "Name: " << q.GetName().value() << "\n";
    }
}

int main() {
    Select(Query{}.Name("Some name"));
    Select(Query{}.Limit(3));
    // You can find objects with empty fields!
    Select(Query{}.Limit(5).Name(""));
}

Play with the code @Coliru

I like the idea of chaining to build the final query object.

Conversion from a String to an Integer

In the following example, Martin Moene applied std::optional to a function that converts strings to integers.

auto to_int(char const* const text) -> std::optional<int>
{
    char* pos = nullptr;
    const int value = std::strtol(text, &pos, 0);

    return pos == text ? std::nullopt : std::optional<int>(value);
}

int main(int argc, char* argv[])
{
    const char* text = argc > 1 ? argv[1] : "42";

    std::optional<int> oi = to_int(text);

    if (oi) std::cout << "'" << text << "' is " << *oi;
    else    std::cout << "'" << text << "' isn't a number";
}

Alternatively with more compact code:

if (auto oi = to_int(text))
    std::cout << "'" << text << "' is " << *oi;
else
    std::cout << "'" << text << "' isn't a number";

Play with the code @Wandbox

Conversion from String, More Generic solution

jft went a bit further with the previous idea of string conversions and wrote a function that uses istringstream to convert to many different numeric types.

// Converts a text number to specified type. 
// All of the text must be a valid number of the specified type.
// eg 63q is invalid
// Defaults to type int
// st - string to convert
// returns either value of converted number or
// no value if text number cannot be converted

template<typename T = int>
std::optional<T> stonum(const std::string& st)
{
    const auto s = trim(st);
    bool ok = s.empty() ?
        false : (std::isdigit(s.front())
            || (((std::is_signed<T>::value
                    && (s.front() == '-'))
                || (s.front() == '+'))
                && ((s.size() > 1)
                    && std::isdigit(s[1]))));

    auto v = T {};

    if (ok) {
        std::istringstream ss(s);

        ss >> v;
        ok = (ss.peek() == EOF);
    }

    return ok ? v : std::optional<T>{};
}

// use case:
string snum = "42.5";
if (auto n = stonum<double>(snum); n.has_value())
    cout << snum << " is double " << *n << endl;
else
    cout << snum << " is not a double" << endl;

Play with the code @Coliru

std::istream::operator>> has overloads for many numeric types, so with this one handy function you can potentially have a converter to many types from a string.

Monadic Extensions

This snippet comes from Lesley Lai

Full code @Gist

The basic idea is to be able to chain operations that return std::optional.

auto x = read_file("exist.txt")
    >> opt_stoi
    >> [](int n) { return std::make_optional(n + 100); };
print(x);

This is done by clever overloading of >>.

template<typename T1,
    typename Func,
    typename Input_Type = typename T1::value_type,
    typename T2 = std::invoke_result_t<Func, Input_Type>
>
constexpr T2 operator>>(T1 input, Func f) {
    static_assert(
        std::is_invocable_v<decltype(f), Input_Type>,
        "The function passed in must take type"
        "(T1::value_type) as its argument"
    );

    if (!input) return std::nullopt;
    else return std::invoke(f, *input);
}

And the functions used in the example:

std::optional<std::string> read_file(const char* filename) {
    std::ifstream file {filename};

    if (!file.is_open()) {
        return {};
    }

    std::string str((std::istreambuf_iterator<char>(file)),
                    std::istreambuf_iterator<char>());
    return {str};
}


std::optional<int> opt_stoi(std::string s) {
    try {
        return std::stoi(s);
    } catch (const std::invalid_argument& e) {
        return {};
    } catch (const std::out_of_range&) {
        return {};
    }
}

template <typename T>
constexpr void print(std::optional<T> val) {
    if (val) {
        std::cout << *val << '\n';
    } else {
        std::cerr << "Error\n";
    }
}

Play with the code @Coliru

And the notes from the author:

This snippet implements a monadic bind operation that chains functions together without explicitly checking errors. The whole gist is inspired by Phil Nash’s talk at the North Denver Metro C++ Meetup.

I use optional here because it is in the standard library, although Expected should fit the error handling job better since it stores information about why an error happened. I do not think the implementation of this function will change for Expected.

Geometry and Intersections

by Arnaud Brejeon

Full code @Gist

The original code is much longer and uses operator overloading, plus a separate type declaration Point and Line, but it should be clear what the code does:

std::optional<Point> intersection(const Line& a, const Line& b) {
    const auto d1 = a.first - a.second;
    const auto d2 = b.first - b.second;
    const auto cross = d1.x * d2.y - d1.y * d2.x;

    if (std::abs(cross) < 1e-6f) { // No intersection
        return {};
    }

    const auto x = b.first - a.first;
    const auto t1 = (x.x * d2.y - x.y * d2.x) / cross;
    return a.first + t1 * d1;
}

Example use case:

const auto i0 = intersection(
    Line(Point(-1, 0), Point(1, 0)),
    Line(Point(0, -1), Point(0, 1))
);

std::cout << std::boolalpha << i0.has_value();

if (i0) {
    std::cout << " : " << i0->x << ", " << i0->y;
}

Simple optional chaining

by Jeremiah O’Neil

While we can chain optional in many ways, Jeremiah showed a simple way:

int a = // value one;
int b = // value two;

if (optional<int> tmp, x;
    (tmp = fa(a)) && (x = fb(b)) && (x = fcd(*tmp, *x)) && (x = fe(*x)))
{
    return *x;
} else {
    return 0;
}

Each of the functions fa, fb, fcd, fe (what awesome names!) returns std::optional. Thanks to the short-circuit rules of && and its left-to-right evaluation, a function won’t be executed if the previous one fails (that is, when it returns nullopt).

Play with the code @Coliru

Handling a throwing constructor

Edoardo Morandi managed to wrap a throwing constructor into a wrapper class that instead of throwing allows you to check if the object is initialised or not.

Full code @Compiler Explorer

// A simple struct, without anything special related to exception handling
struct S_impl {
    S_impl() = default;

    // This can throw!
    S_impl(std::size_t s) : v(s) {}

    std::vector<double>& get() { return v; }

private:
    std::vector<double> v;
};

// A (too) simple user interface for S_impl
struct S : std::optional<S_impl> {
    template<typename... Args>
    // A `noexcept` wrapper to construct the real implementation.
    S(Args&&... args) noexcept :
        optional<S_impl>(
            // Construct std::optional inplace using constructor initialization,
            // leading to pre-C++20 ugly code to universal forwarding :(
            [args = std::tuple<Args...>(std::forward<Args>(args)...)]() mutable {
                return std::apply([](auto&&... args) -> std::optional<S_impl> {
                    try {
                        return std::optional<S_impl>(std::in_place, std::forward<Args>(args)...);
                    } catch (...) {
                        return std::nullopt;
                    }
                }, std::move(args));
            }()
        )
    {
    }
};

The code converts a class with a throwing constructor to a wrapper class that won’t throw. Such a wrapper derives from std::optional<T> so you can directly check if the value is there or not.

Getting File contents

by Michael Cook

full code @Coliru

std::optional<std::string>
get_file_contents(std::string const& filename)
{
    std::ifstream inf{filename};
    if (!inf.is_open())
        return std::nullopt;
    return std::string{std::istreambuf_iterator<char>{inf}, {}};
}

int main()
{
    if (auto stat = get_file_contents("/proc/self/stat"))
        std::cout << "stat " << *stat << '\n';
    else
        std::cout << "no stat\n";

    if (auto nsf = get_file_contents("/no/such/file"))
        std::cout << "nsf " << *nsf << '\n';
    else
        std::cout << "no nsf\n";
}

Haskell’s listToMaybe

From Zachary

Full code @Compiler Explorer

template<typename T>
using Opt = std::optional<T>;

using std::begin;

// listToMaybe :: [T] -> Opt<T>
template<typename T, template<typename> typename Cont>
auto listToMaybe(Cont<T> const& xs) -> Opt<T>
{
    return xs.empty() ? Opt<T>{} : Opt<T>{*(begin(xs))};
}

auto f()
{
    auto as = std::vector<int>{};
    std::cout << listToMaybe(as).value_or(0) << '\n'; // 0
}

Haskell listToMaybe documentation.

Cleaner interface for map.find

Vincent Zalzal made a simple, yet handy extension to std::map. Rather than checking for map::end you can use optional.

the full code @Coliru

// Provide an iterator-free interface for lookups to map-like objects.
// Warning: the output value is copied into the optional.
template<typename Map, typename Key>
auto lookup(const Map& m, const Key& k)
{
    auto it = m.find(k);
    return it != m.end()
        ? std::make_optional(it->second)
        : std::nullopt;
}

int main()
{
    const std::map<int, int> squares = {{1, 1}, {2, 4}, {3, 9}, {4, 16}};

    // cleaner, no need for != end()
    if (const auto square = lookup(squares, 2))
    {
        std::cout << "Square is " << *square << '\n';
    }
    else
    {
        std::cout << "Square is unknown.\n";
    }
}

Comparing against map::end is sometimes ugly, so wrapping the search into optional looks nice.

I wonder if there are plans to apply optional/variant/any to API in STL. Some overloads would be an excellent addition.

Configuration of a Nuclear Simulation

This comes from Mihai Niculescu who used optional in the configuration of a nuclear simulator.

class ParticleRecord
{
    friend std::istream& operator>>(std::istream& is,
                                    ParticleRecord& p);
public:
    double x() const { return x_; }
    double y() const { return y_; }
    double z() const { return z_; }
    double px() const { return px_; }
    double py() const { return py_; }
    double pz() const { return pz_; }
    double mass() const { return mass_; }

    const std::optional<extended_fields>& extendedInfo() const
    { return extendedData; }

private:
    void setExtended(double tdecay, double tformation, long uniqueId)
    {
        extended_fields einfo;
        einfo.tdec = tdecay;
        einfo.tform = tformation;
        einfo.uid = uniqueId;

        extendedData = einfo;
    }

    // data members carry a trailing underscore so that they
    // don't clash with the accessor names
    double x_, y_, z_;     // position (x,y,z)
    double px_, py_, pz_;  // momentum (px, py, pz)
    double mass_;          // mass

    // extended data is available when Sim's parameter 13 is ON
    std::optional<extended_fields> extendedData;
};

A natural choice for values that might not be available. Here, if the extendedData is loaded, then the simulation will behave differently.

Factory

This comes from Russell Davidson.

using namelist = std::vector<std::string>;

template<typename Product>
struct basicFactory : boost::noncopyable
{
    virtual ~basicFactory() {}
    virtual bool canProduce(const std::string& st) const = 0;
    virtual std::optional<Product> produce(const std::string& st)
        const = 0;
    virtual namelist keys() const = 0;
};

template<typename T,
         typename RetType,
         typename Product,
         typename Converter>
class objFactory : public basicFactory<Product>, public Converter
{
    const Data::Lookup<T, RetType>* tbl_;

public:
    objFactory(const Data::Lookup<T, RetType>* tbl) : tbl_(tbl) {}
    bool canProduce(const std::string& key) const
    {
        return tbl_->ismember(key);
    }

    std::optional<Product> produce(const std::string& key) const
    {
        RetType ret = tbl_->find(key);
        if (!ret) return std::nullopt;
        return std::make_optional<Product>(Converter::convert(ret));
    }

    namelist keys() const { return tbl_->keys(); }
};

The key method is std::optional<Product> produce(const std::string& key) const which returns a created Product or nullopt.

Summary

Once again, thanks for all of the submissions. There are many ways you can use a particular helper type - in this case std::optional. By looking at real-life examples, you can hopefully learn more.

Do you have any comments regarding the examples? Would you suggest some changes/improvements? Let us know.

Everything You Need to Know About std::any from C++17


Using std::any in C++17

With std::optional you can represent some Type or nothing. With std::variant you can wrap several variants into one entity. And C++17 gives us one more wrapper type: std::any that can hold anything in a type-safe way.

The Basics

Until now, Standard C++ gave you few options for holding values of different types in a single variable. Of course, you could use void*, yet this wasn’t super safe.

Potentially, void* could be wrapped in a class with some type discriminator.

class MyAny
{
    void* _value;
    TypeInfo _typeInfo;
};

As you see, we have some basic form of the type, but it’s a bit of coding required to make sure MyAny is type-safe. That’s why it’s best to use the Standard Library rather than rolling a custom implementation.

And this is what std::any from C++17 is in its basic form. It gives you a chance to store anything in an object, and it reports errors (or throws exceptions) when you’d like to access a type that is not active.

A little demo:

std::any a(12);

// set any value:
a = std::string("Hello!");
a = 16;
// reading a value:

// we can read it as int
std::cout << std::any_cast<int>(a) << '\n';

// but not as string:
try
{
    std::cout << std::any_cast<std::string>(a) << '\n';
}
catch (const std::bad_any_cast& e)
{
    std::cout << e.what() << '\n';
}

// reset and check if it contains any value:
a.reset();
if (!a.has_value())
{
    std::cout << "a is empty!" << "\n";
}

// you can use it in a container:
std::map<std::string, std::any> m;
m["integer"] = 10;
m["string"] = std::string("Hello World");
m["float"] = 1.0f;

for (auto& [key, val] : m)
{
    if (val.type() == typeid(int))
        std::cout << "int: " << std::any_cast<int>(val) << "\n";
    else if (val.type() == typeid(std::string))
        std::cout << "string: " << std::any_cast<std::string>(val) << "\n";
    else if (val.type() == typeid(float))
        std::cout << "float: " << std::any_cast<float>(val) << "\n";
}

The code will output:

16
bad any_cast
a is empty!
float: 1
int: 10
string: Hello World

Play with the code @Coliru

We have several things shown in the example above:

  • std::any is not a template class like std::optional or std::variant.
  • by default it contains no value, and you can check it via .has_value().
  • you can reset an any object via .reset().
  • it works on “decayed” types - so before assignment, initialization, emplacement the type is transformed by std::decay.
  • when a different type is assigned, then the active type is destroyed.
  • you can access the value by using std::any_cast<T>, it will throw bad_any_cast if the active type is not T.
  • you can discover the active type by using .type(), which returns the std::type_info of the type.

The above example looks impressive - a true variable type in C++! If you like JavaScript then you can even make all of your variables std::any and use C++ like JavaScript :)

But maybe there are some legitimate use cases?

When to Use

While I perceive void* as an extremely unsafe pattern with some limited use cases, std::any adds type-safety, and that’s why it has some real use cases.

Some possibilities:

  • In Libraries - when a library type has to hold or pass anything without knowing the set of available types.
  • Parsing files - if you really cannot specify what the supported types are.
  • Message passing.
  • Bindings with a scripting language.
  • Implementing an interpreter for a scripting language
  • User Interface - controls might hold anything
  • Entities in an editor

I believe in a lot of cases we can limit the set of supported types, and that’s why std::variant might be a better choice. Of course, it gets tricky when you implement a library without knowing the final applications - so you don’t know the possible types that will be stored in an object.

The demo showed some basics, but in the following sections, you’ll discover more details about std::any so read on.

The Series

This article is part of my series about C++17 Library Utilities. Here’s the list of the other topics that I’ll cover:

Resources about C++17 STL:

std::any Creation

There are several ways you can create a std::any object:

  • a default initialization - then the object is empty
  • a direct initialization with a value/object
  • in-place construction with std::in_place_type
  • via std::make_any

You can see it in the following example:

// default initialization:
std::any a;
assert(!a.has_value());

// initialization with an object:
std::any a2(10); // int
std::any a3(MyType(10, 11));

// in_place:
std::any a4(std::in_place_type<MyType>, 10, 11);
std::any a5{std::in_place_type<std::string>, "Hello World"};

// make_any
std::any a6 = std::make_any<std::string>("Hello World");

Play with the code @Coliru

Changing the Value

When you want to change the currently stored value in std::any then you have two options: use emplace or the assignment:

std::any a;

a = MyType(10, 11);
a = std::string("Hello");

a.emplace<float>(100.5f);
a.emplace<std::vector<int>>({10, 11, 12, 13});
a.emplace<MyType>(10, 11);

Play with the code @Coliru

Object Lifetime

The crucial part of being safe for std::any is not to leak any resources. To achieve this behaviour std::any will destroy any active object before assigning a new value.

std::any var = std::make_any<MyType>();
var = 100.0f;
std::cout << std::any_cast<float>(var) << "\n";

Play with the code @Coliru

This will produce the following output:

MyType::MyType
MyType::~MyType
100

The any object is initialized with MyType, but before it gets a new value (of 100.0f) it calls the destructor of MyType.

Accessing The Stored Value

In order to read the currently active value in std::any you have mostly one option: std::any_cast. This function returns the value of the requested type if it’s in the object.

However, this function template is quite powerful, as it can be used in several ways:

  • to return a copy of the value, and throw std::bad_any_cast when it fails
  • to return a reference (also writable), and throw std::bad_any_cast when it fails
  • to return a pointer to the value (const or not) or nullptr on failure

See the example

struct MyType
{
    int a, b;

    MyType(int x, int y) : a(x), b(y) {}

    void Print() { std::cout << a << ", " << b << "\n"; }
};

int main()
{
    std::any var = std::make_any<MyType>(10, 10);
    try
    {
        std::any_cast<MyType&>(var).Print();
        std::any_cast<MyType&>(var).a = 11; // read/write
        std::any_cast<MyType&>(var).Print();
        std::any_cast<int>(var); // throw!
    }
    catch (const std::bad_any_cast& e)
    {
        std::cout << e.what() << '\n';
    }

    int* p = std::any_cast<int>(&var);
    std::cout << (p ? "contains int... \n" : "doesn't contain an int...\n");

    MyType* pt = std::any_cast<MyType>(&var);
    if (pt)
    {
        pt->a = 12;
        std::any_cast<MyType&>(var).Print();
    }
}

Play with the code @Coliru

As you see, you have two options regarding error handling: via exceptions (std::bad_any_cast) or by returning a pointer (or nullptr). The std::any_cast overloads for pointer access are also marked with noexcept.

Performance & Memory Considerations

std::any looks quite powerful and you might use it to hold variables of variable types… but you might ask what’s the price of such flexibility?

The main issue: extra dynamic memory allocations.

std::variant and std::optional don’t require any extra memory allocations but this is because they know which type (or types) will be stored in the object. std::any has no such knowledge and that’s why it might use some heap memory.

Will it happen always, or sometimes? What’re the rules? Will it happen even for a simple type like int?

Let’s see what the standard says:

From The Standard:

Implementations should avoid the use of dynamically allocated memory for a small contained value. Example: where the object constructed is holding only an int. Such small-object optimization shall only be applied to types T for which is_nothrow_move_constructible_v<T> is true.

To sum up: Implementations are encouraged to use SBO - Small Buffer Optimization. But that also comes at some cost: it will make the type larger - to fit the buffer.

Let’s check what’s the size of std::any:

Here are the results from the three compilers:

Compiler                    sizeof(any)
GCC 8.1 (Coliru)            16
Clang 7.0.0 (Wandbox)       32
MSVC 2017 15.7.0 32-bit     40
MSVC 2017 15.7.0 64-bit     64

Play with code @Coliru

In general, as you see, std::any is not a “simple” type and it brings a lot of overhead. It’s usually not small: due to SBO it takes 16 or 32 bytes (GCC or Clang), or even 64 bytes in MSVC!

Migration from boost::any

Boost Any was introduced around the year 2001 (Version 1.23.0). What’s more, the author of the boost library - Kevlin Henney - is also the author of the proposal for std::any. So the two types are strongly connected, and the STL version is heavily based on the predecessor.

Here are the main changes:

Feature                          Boost.Any (1.67.0)   std::any
Extra memory allocation          Yes                  Yes
Small buffer optimization        No                   Yes
emplace                          No                   Yes
in_place_type_t in constructor   No                   Yes

The main difference is that boost.any doesn’t use SBO, so it’s a much smaller type (GCC 8.1 reports 8 bytes), but as a consequence, it will allocate memory even for simple types, like int.

Examples of std::any

The core of std::any is flexibility. So in the examples below, you can see some ideas (or concrete implementations) where holding a variable type can make an application a bit simpler.

Parsing files

In the examples about std::variant (see it here) you could see how it’s possible to parse config files and store the result as an alternative of several types. Yet, if you write a really generic solution - maybe as a part of some library, then you might not know all the possible types.

Storing std::any as a value for a property might be good enough from the performance point of view and will give you flexibility.

Message Passing

In the Windows API, which is mostly C, there’s a message passing system that uses message ids with two optional parameters that store the value of the message. Based on that mechanism you can implement WndProc that handles the messages passed to your window/control:

LRESULT CALLBACK WindowProc(
    _In_ HWND   hwnd,
    _In_ UINT   uMsg,
    _In_ WPARAM wParam,
    _In_ LPARAM lParam
);

The trick here is that the values are stored in wParam or lParam in various forms. Sometimes you have to use only a few bytes of wParam.

What if we changed this system into std::any, so that a message could pass anything to the handling method?

For example:

class Message
{
public:
    enum class Type
    {
        Init,
        Closing,
        ShowWindow,
        DrawWindow
    };

public:
    explicit Message(Type type, std::any param) :
        mType(type),
        mParam(param)
    {}
    explicit Message(Type type) :
        mType(type)
    {}

    Type mType;
    std::any mParam;
};

class Window
{
public:
    virtual void HandleMessage(const Message& msg) = 0;
};

For example you can send a message to a window:

Message m(Message::Type::ShowWindow, std::make_pair(10, 11));
yourWindow.HandleMessage(m);

Then the window can respond to the message like:

switch (msg.mType) {
// ...
case Message::Type::ShowWindow:
    {
        auto pos = std::any_cast<std::pair<int, int>>(msg.mParam);
        std::cout << "ShowWindow: "
                  << pos.first << ", "
                  << pos.second << "\n";
        break;
    }
}

Play with the code @Coliru

Of course, you have to define how the values are specified (what the types of a message’s value are), but now you can use real types rather than doing various tricks with integers.

Properties

The original paper that introduces any to C++, N1939, shows an example of a property class.

struct property
{
    property();
    property(const std::string&, const std::any&);

    std::string name;
    std::any value;
};

typedef std::vector<property> properties;

The properties object looks very powerful as it can hold many different types. As a first use case a generic UI manager comes to my mind, or a game editor.

Passing across boundaries

Some time ago there was a thread on r/cpp (https://www.reddit.com/r/cpp/comments/7l3i19/why_was_stdany_added_to_c17/) about std::any. And there was at least one great comment that summarises when the type should be used:

From the comment:

The general gist is that std::any allows passing ownership of arbitrary values across boundaries that don’t know about those types.

Everything that I mentioned before is close to this idea:

  • in a UI library: you don’t know what the final types that a client might use are
  • message passing: same idea, you’d like to have the flexibility for the client
  • parsing files: to support custom types a really “variable” type could be useful


Wrap Up

In this article, we covered a lot about std::any!

Here are the things to remember about std::any:

  • std::any is not a template class
  • std::any uses Small Buffer Optimization, so it will not dynamically allocate memory for simple types like ints, doubles… but for larger types it will use extra new.
  • std::any might be considered ‘heavy’, but offers a lot of flexibility and type-safety.
  • you can access the currently stored value by using any_cast that offers a few “modes”: for example it might throw an exception or just return nullptr.
  • use it when you don’t know the possible types, in other cases consider std::variant.

Now a few questions to you:

  • Have you used std::any or boost::any?
  • Can you mention what the use cases were?
  • Where do you see std::any might be useful?